Abstract

Causal models defined in terms of a collection of equations, as defined by Pearl, are
axiomatized here. Axiomatizations are provided for three successively more general classes
of causal models: (1) the class of recursive theories (those without feedback), (2) the class
of theories where the solutions to the equations are unique, (3) arbitrary theories (where
the equations may not have solutions and, if they do, they are not necessarily unique). It
is shown that to reason about causality in the most general third class, we must extend
the language used by Galles and Pearl (1997, 1998). In addition, the complexity of the
decision procedures is characterized for all the languages and classes of models considered.

1. Introduction

The important role of causal reasoning — in prediction, explanation, and counterfactual 
reasoning — has been argued eloquently in a number of recent papers and books (Chajewska 
& Halpern, 1997; Heckerman & Shachter, 1995; Henrion & Druzdzel, 1990; Druzdzel & 
Simon, 1993; Pearl, 1995; Pearl & Verma, 1991; Spirtes, Glymour, & Scheines, 1993). If 

we are to reason about causality, then it is certainly useful to find axioms that characterize
such reasoning. The way we go about axiomatizing causal reasoning depends on two critical
factors:

• how we model causality, and

• the language that we use to reason about it.

In this paper, I consider one approach to modeling causality, using structural equations. 
The use of structural equations as a model for causality is standard in the social sciences, 
and seems to go back to the work of Sewall Wright in the 1920s (see (Goldberger, 1972) for a 
discussion); the particular framework that I use here is due to Pearl (1995). Galles and Pearl 
(1997) introduce some axioms for causal reasoning in this framework; they also provide a 
complete axiomatic characterization of reasoning about causality in this framework, under 
the strong assumption that there is a fixed, given causal ordering $\prec$ of the equations (Galles
& Pearl, 1998). Roughly speaking, this means there is a way of ordering the variables that
appear in the equations and we have explicit axioms that say $X_j$ has no influence on $X_i$ if
$X_i \prec X_j$ in this causal ordering.

In this paper, I extend the results of Galles and Pearl by providing a complete axiomatic 
characterization for three increasingly general classes of causal models (defined by structural 
equations): 

1. the class of recursive theories (those without feedback; this generalizes the situation
considered by Galles and Pearl (1998), since every fixed causal ordering of the variables
gives rise to a recursive theory),

2. the class of theories where the solutions to the equations are unique, 

3. arbitrary theories (where the equations may not have solutions and, if they do, they 
are not necessarily unique). 

In the process, I clarify some problems in the Galles-Pearl completeness proof that arise 
from the lack of propositional connectives (particularly disjunction) in the language they 
consider and, more generally, highlight the role of the language in reasoning about causality. 
I also characterize the complexity of the decision problem for all these languages and classes 
of models. 

The rest of the paper is organized as follows. In Section 2, I give the syntax and semantics
of the languages I will be considering and review the definition of modifiable causal models.
In Section 3, I present the complete axiomatizations. In Section 4, I consider the complexity
of the decision procedure. I conclude in Section 5.

2. Syntax and Semantics 

An axiomatization is given with respect to a particular language and a class of models, so 
we need to make both precise. Both the language and models I use are based on those 
considered by Galles and Pearl (1997, 1998). To make comparisons easier, I use their 
notation as much as possible. I start with the semantic model, since it motivates some of 
the choices in the syntax, then give the syntax, and finally define the semantics of formulas. 

2.1 Causal Models 

The basic picture here is that we are interested in the values of random variables, some of 
which have a causal effect on others. This effect is modeled by a set of structural equations. 

In practice, it seems useful to split the random variables into two sets, the exogenous 
variables, whose values are determined by factors outside the model, and the endogenous 
variables. It is these endogenous variables whose values are described by the structural 
equations. 

More formally, a signature $\mathcal{S}$ is a tuple $(\mathcal{U}, \mathcal{V}, \mathcal{R})$, where $\mathcal{U}$ is a finite set of exogenous
variables, $\mathcal{V}$ is a finite set of endogenous variables, and $\mathcal{R}$ associates with every variable
$Y \in \mathcal{U} \cup \mathcal{V}$ a nonempty set $\mathcal{R}(Y)$ of possible values for $Y$ (the range of possible values of $Y$).
Unless explicitly noted otherwise, I assume that $\mathcal{R}(Y)$ is a finite set for each $Y \in \mathcal{U} \cup \mathcal{V}$ and
$|\mathcal{R}(Y)| \geq 2$. The assumption that $\mathcal{U}$ and $\mathcal{V}$ are finite is relatively innocuous; as we shall see,
the assumption that $\mathcal{R}(Y)$ is finite has more of an impact on the axioms. The assumption
that $|\mathcal{R}(Y)| \geq 2$ allows us to ignore the trivial situation where $|\mathcal{R}(Y)| = 1$. If $|\mathcal{R}(Y)| = 1$,
we can just remove the variable $Y$ from the signature without loss of expressiveness.

A causal model over signature $\mathcal{S}$ is a tuple $T = (\mathcal{S}, \mathcal{F})$, where $\mathcal{F}$ associates with each
variable $X \in \mathcal{V}$ a function denoted $F_X$ such that
$F_X : (\times_{U \in \mathcal{U}} \mathcal{R}(U)) \times (\times_{Y \in \mathcal{V} - \{X\}} \mathcal{R}(Y)) \rightarrow \mathcal{R}(X)$.
$F_X$ tells us the value of $X$ given the values of all the other variables in $\mathcal{U} \cup \mathcal{V}$. We
think of the functions $F_X$ as defining a set of (modifiable) structural equations, relating the
values of the variables. Because $F_X$ is a function, there is a unique value of $X$ once we have
set all the other variables. Notice we have such functions only for the endogenous variables. 
The exogenous variables are taken as given; it is their effect on the endogenous variables 
(and the effect of the endogenous variables on each other) that we are modeling with the 
structural equations. 

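Given a causal model $T = (\mathcal{S}, \mathcal{F})$ over signature $\mathcal{S}$, a (possibly empty) vector $\vec{X}$ of
variables in $\mathcal{V}$, and vectors $\vec{x}$ and $\vec{u}$ of values for the variables in $\vec{X}$ and $\mathcal{U}$, respectively, we
can define a new causal model denoted $T_{\vec{X} \leftarrow \vec{x}}(\vec{u})$ over the signature
$\mathcal{S}_{\vec{X}} = (\emptyset, \mathcal{V} - \vec{X}, \mathcal{R}|_{\mathcal{V} - \vec{X}})$.¹
Intuitively, this is the causal model that results when the variables in $\vec{X}$ are set to $\vec{x}$ and the
variables in $\mathcal{U}$ are set to $\vec{u}$. Formally, $T_{\vec{X} \leftarrow \vec{x}}(\vec{u}) = (\mathcal{S}_{\vec{X}}, \mathcal{F}^{\vec{X} \leftarrow \vec{x}, \vec{u}})$, where $F_Y^{\vec{X} \leftarrow \vec{x}, \vec{u}}$ is obtained
from $F_Y$ by setting the values of the variables in $\vec{X}$ to $\vec{x}$ and the values of the variables in $\mathcal{U}$
to $\vec{u}$. The causal model $T_{\vec{X} \leftarrow \vec{x}}(\vec{u})$ is called a submodel of $T$ by Pearl (1999). It can describe a
possible counterfactual situation; that is, even though, under normal circumstances, setting
the exogenous variables to $\vec{u}$ may result in the variables $\vec{X}$ having values $\vec{x}' \neq \vec{x}$, this
submodel describes what happens if they are set to $\vec{x}$ due to some "external action", the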
cause of which is not modeled explicitly. For example, to determine if the manufacturer 
is at fault in an accident that involved a poorly maintained car, we may want to consider 
what would have happened had the car been well maintained. If there is a random variable 
in the signature that describes how well maintained the car is, then this means examining 
the submodel where that random variable is set to 1 (the car is well maintained). It is this 
ability to examine counterfactual situations that makes causal structures a useful tool for 
reasoning about causality. 

Notice that, in general, there may not be a unique vector of values that simultaneously
satisfies the equations in $T_{\vec{X} \leftarrow \vec{x}}(\vec{u})$; indeed, there may not be a solution at all. One special
case where there is guaranteed to be such a unique solution is if there is some total ordering
$\prec$ of the variables in $\mathcal{V}$ such that if $X \prec Y$, then $F_X$ is independent of the value of $Y$;
i.e., $F_X(\ldots, y, \ldots) = F_X(\ldots, y', \ldots)$ for all $y, y' \in \mathcal{R}(Y)$. In this case, the causal model is
said to be recursive or acyclic. Intuitively, if the theory is recursive, there is no feedback.
If $X \prec Y$, then the value of $X$ may affect the value of $Y$, but the value of $Y$ has no effect
on the value of $X$.

It should be clear that if $T$ is a recursive theory, then there is always a unique solution
to the equations in $T_{\vec{X} \leftarrow \vec{x}}(\vec{u})$, for all $\vec{X}$, $\vec{x}$, and $\vec{u}$. (We simply solve for the variables in the
order given by $\prec$.) On the other hand, as the following example shows, it is not hard to
construct nonrecursive theories for which there is always a unique solution to the equations 
that arise. 

Example 2.1: Let $\mathcal{S} = (\emptyset, \{X, Y\}, \mathcal{R})$, where $\mathcal{R}(X) = \mathcal{R}(Y) = \{-1, 0, 1\}$, and let $T =
(\mathcal{S}, \mathcal{F})$, where $F_X$ is characterized by the equation $X = Y$ and $F_Y$ is characterized by the
equation $Y = -X$ (that is, $F_X(y) = y$ and $F_Y(x) = -x$). Clearly $T$ is not recursive; the
value of $X$ depends on the value of $Y$ and the value of $Y$ depends on that of $X$. Nevertheless,
it is easy to see that $T$ has the unique solution $X = 0$, $Y = 0$, $T_{X \leftarrow x}$ has the unique solution
$Y = -x$, and $T_{Y \leftarrow y}$ has the unique solution $X = y$. |
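
The claims of Example 2.1 can be checked mechanically. The following is a minimal Python sketch, not part of the original development: the names `solutions`, `RANGES`, and `EQUATIONS` are hypothetical, there are no exogenous variables (as in the example), and an intervened variable is kept in the assignment but pinned to its set value instead of being removed from the signature.

```python
from itertools import product

def solutions(ranges, equations, interventions=None):
    """Return every assignment to the endogenous variables that satisfies all
    structural equations, where the equation of each intervened variable is
    replaced by the constant it is set to (the submodel construction)."""
    interventions = interventions or {}
    names = sorted(ranges)
    found = []
    for values in product(*(ranges[n] for n in names)):
        assignment = dict(zip(names, values))
        ok = True
        for var in names:
            if var in interventions:
                ok = assignment[var] == interventions[var]
            else:
                ok = assignment[var] == equations[var](assignment)
            if not ok:
                break
        if ok:
            found.append(assignment)
    return found

# Example 2.1: R(X) = R(Y) = {-1, 0, 1}, X = Y, Y = -X.
RANGES = {"X": (-1, 0, 1), "Y": (-1, 0, 1)}
EQUATIONS = {
    "X": lambda a: a["Y"],   # F_X(y) = y
    "Y": lambda a: -a["X"],  # F_Y(x) = -x
}

print(solutions(RANGES, EQUATIONS))
for x in RANGES["X"]:
    print("X <-", x, solutions(RANGES, EQUATIONS, {"X": x}))
```

Exhaustive enumeration is only feasible because all ranges are finite; it is meant to illustrate the definitions, not to be an efficient solver.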

1. I am implicitly identifying the vector $\vec{X}$ with the subset of $\mathcal{V}$ consisting of the variables in $\vec{X}$. I do this
throughout the paper. $\mathcal{R}|_{\mathcal{V} - \vec{X}}$ is the restriction of $\mathcal{R}$ to the variables in $\mathcal{V} - \vec{X}$.

In this paper, I consider three successively more general classes of causal models for a
given signature $\mathcal{S} = (\mathcal{U}, \mathcal{V}, \mathcal{R})$.

1. $\mathcal{T}_{rec}(\mathcal{S})$: the class of recursive causal models over signature $\mathcal{S}$,

2. $\mathcal{T}_{uniq}(\mathcal{S})$: the class of causal models $T$ over $\mathcal{S}$ where for all $\vec{X} \subseteq \mathcal{V}$, $\vec{x}$, and $\vec{u}$, the
equations in $T_{\vec{X} \leftarrow \vec{x}}(\vec{u})$ have a unique solution,

3. $\mathcal{T}(\mathcal{S})$: the class of all causal models over $\mathcal{S}$.

I often omit the signature S when it is clear from context or irrelevant, but the reader 
should bear in mind its important role. 

Why should we be interested in causal models that do not possess unique solutions? 
Are there real causal systems that do not possess unique solutions? The issue of whether
nonrecursive systems can be given a causal interpretation is discussed at some length by
Strotz and Wold (1960). They argue that there are reasonable ways of interpreting such
systems causally, so that the answer is yes. Under these interpretations, there may well be
more than one solution to the equations. Perhaps the best way to view such equations is to 
think of the variables in V as being mutually interdependent; changing any one of them may 
cause a change in the others. (Think of demand and supply in economics or populations of 
rabbits and wolves.) The solutions to the equations then represent equilibrium situations. 
If there is more than one equilibrium, there will be more than one solution to the equations. 
Of course, if there are no equilibria, then there will be no solutions to the equations. 

A related way of thinking about these equations is that they represent atemporal versions
of temporal causal equations. That is, suppose that we replace every variable $Y \in \mathcal{U} \cup \mathcal{V}$ by
a family of variables $Y_0, Y_1, Y_2, \ldots$, where, intuitively, $Y_t$ represents the value of $Y$ at time $t$.
Each equation $F_X \in \mathcal{F}$ is then replaced by a family of equations $F_{X_t}$, where $F_{X_t}$ depends only
on exogenous variables $U_{t'}$ with $t' < t$ and endogenous variables $Y_{t'}$ with $t' < t$. This gives
us a recursive system. The values of $X_t$ under some setting of the variables with subscript
0 represent the evolution of $X$ under that setting of the variables. If $X_t$ eventually stabilizes,
then we might expect the equilibrium value to be the value of $X$ in some solution to the
original set of equations. If $X_t$ does not stabilize, then we would not expect there to be a
solution to the original set of equations.
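
The temporal reading can be illustrated with a small Python sketch. It makes several assumptions that go beyond the text: time is discrete, every variable at time $t$ is computed from all variables at time $t - 1$ (one particular way of unrolling), there are no exogenous variables, and the names `unroll`, `fixed_point`, `MUTUAL`, and `NO_EQUILIBRIUM` are hypothetical. Under this update rule, a state that is left unchanged by one step is exactly a solution to the original (atemporal) equations.

```python
def unroll(equations, initial, steps):
    """Synchronously update all variables: the state at time t is obtained
    by applying every structural equation to the state at time t - 1."""
    state = dict(initial)
    history = [state]
    for _ in range(steps):
        state = {var: f(state) for var, f in equations.items()}
        history.append(state)
    return history

def fixed_point(history):
    """Return the last state if the final update left it unchanged, else None."""
    if len(history) >= 2 and history[-1] == history[-2]:
        return history[-1]
    return None

# X = Y, Y = X over {0, 1}: two equilibria, (0, 0) and (1, 1).
MUTUAL = {"X": lambda s: s["Y"], "Y": lambda s: s["X"]}

# X = 1 - Y, Y = X over {0, 1}: the equations have no solution.
NO_EQUILIBRIUM = {"X": lambda s: 1 - s["Y"], "Y": lambda s: s["X"]}

print(fixed_point(unroll(MUTUAL, {"X": 1, "Y": 1}, 5)))
print(fixed_point(unroll(NO_EQUILIBRIUM, {"X": 0, "Y": 0}, 5)))
```

The converse direction is weaker: a system with a solution may still fail to stabilize from a given starting state, which is why the text speaks only of what we "might expect".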

2.2 Syntax 

I focus here on two languages. Both languages are parameterized by a signature $\mathcal{S}$. The
first language, $\mathcal{L}^+(\mathcal{S})$, borrows ideas from dynamic logic (Harel, 1979). Again, I often write
$\mathcal{L}^+$ rather than $\mathcal{L}^+(\mathcal{S})$ (and similarly for the other languages defined below) to simplify the
notation. A basic causal formula is one of the form $[Y_1 \leftarrow y_1, \ldots, Y_k \leftarrow y_k]\varphi$, where $\varphi$ is
a Boolean combination of formulas of the form $X(\vec{u}) = x$, $Y_1, \ldots, Y_k, X$ are variables in $\mathcal{V}$,
$Y_1, \ldots, Y_k$ are distinct, $x \in \mathcal{R}(X)$, and $\vec{u}$ is a vector of values for all the variables in $\mathcal{U}$. I
typically abbreviate such a formula as $[\vec{Y} \leftarrow \vec{y}]\varphi$. The special case where $k = 0$ (which is
allowed) is abbreviated as $[true]\varphi$. $[\vec{Y} \leftarrow \vec{y}]X(\vec{u}) = x$ can be interpreted as "in all possible
solutions to the structural equations obtained after setting $Y_i$ to $y_i$, $i = 1, \ldots, k$, and the
exogenous variables to $\vec{u}$, random variable $X$ has value $x$". As we shall see, this formula
is true in a causal model $T$ if in all solutions to the equations in $T_{\vec{Y} \leftarrow \vec{y}}(\vec{u})$, the random
variable $X$ has value $x$. Note that this formula is trivially true if there are no solutions
to the structural equations. A causal formula is a Boolean combination of basic causal
formulas.

Just as with dynamic logic, we can also define the formula $\langle \vec{Y} \leftarrow \vec{y} \rangle (X(\vec{u}) = x)$ to be an
abbreviation of $\neg[\vec{Y} \leftarrow \vec{y}]\neg(X(\vec{u}) = x)$. $\langle \vec{Y} \leftarrow \vec{y} \rangle (X(\vec{u}) = x)$ is the dual of $[\vec{Y} \leftarrow \vec{y}](X(\vec{u}) = x)$;
it is true if, in some solution to the structural equations obtained after setting $Y_i$ to
$y_i$, $i = 1, \ldots, k$, and the exogenous variables to $\vec{u}$, random variable $X$ has value $x$. Taking
$true(\vec{u})$ to be an abbreviation for $X(\vec{u}) = x \lor X(\vec{u}) \neq x$ for some variable $X$ and $x \in \mathcal{R}(X)$,
and taking $false(\vec{u})$ to be an abbreviation for $\neg true(\vec{u})$, we have that $\langle \vec{Y} \leftarrow \vec{y} \rangle true(\vec{u})$ is true
if there is some solution to the equations obtained by setting $Y_i$ to $y_i$, $i = 1, \ldots, k$, and
the variables in $\mathcal{U}$ to $\vec{u}$ (since $[\vec{Y} \leftarrow \vec{y}]false(\vec{u})$ says that in every solution to the equations
obtained by setting $Y_i$ to $y_i$ and $\mathcal{U}$ to $\vec{u}$, the formula $false(\vec{u})$ is true, and thus holds exactly
if the equations have no solution).

Let $\mathcal{L}_{uniq}(\mathcal{S})$ be the sublanguage of $\mathcal{L}^+(\mathcal{S})$ which consists of Boolean combinations of
formulas of the form $[\vec{Y} \leftarrow \vec{y}]X(\vec{u}) = x$. Thus, the difference between $\mathcal{L}_{uniq}$ and $\mathcal{L}^+$ is
that in $\mathcal{L}_{uniq}$, only $X(\vec{u}) = x$ is allowed after $[\vec{Y} \leftarrow \vec{y}]$, while in $\mathcal{L}^+$, arbitrary Boolean
combinations of formulas of the form $X(\vec{u}) = x$ are allowed. As we shall see, for reasoning
about causality in $\mathcal{T}_{uniq}$, the language $\mathcal{L}_{uniq}$ is adequate, since it is equivalent in expressive
power to $\mathcal{L}^+$. However, this is no longer the case when reasoning about causality in $\mathcal{T}$.

Following Galles and Pearl's notation, I often write $[\vec{Y} \leftarrow \vec{y}]X(\vec{u}) = x$ as $X_{\vec{Y} \leftarrow \vec{y}}(\vec{u}) = x$.
If $\vec{Y}$ is clear from context or irrelevant, I further abbreviate this as $X_{\vec{y}}(\vec{u}) = x$. (This is
actually the notation used by Galles and Pearl.) Let $\mathcal{L}_{GP}(\mathcal{S})$ be the sublanguage of $\mathcal{L}_{uniq}(\mathcal{S})$
consisting of just conjunctions of formulas of the form $X_{\vec{y}}(\vec{u}) = x$. In particular, it does not
contain disjunctions or negations of such formulas. Although Galles and Pearl (1998) are
not explicit about the language they are using, it seems to be $\mathcal{L}_{GP}$.²

2.3 Semantics 

A formula in $\mathcal{L}^+(\mathcal{S})$ is true or false in a causal model in $\mathcal{T}(\mathcal{S})$. As usual, we write $T \models \varphi$
if the causal formula $\varphi$ is true in causal model $T$. For a basic causal formula, we have
$T \models [\vec{Y} \leftarrow \vec{y}](X(\vec{u}) = x)$ if in all solutions to $T_{\vec{Y} \leftarrow \vec{y}}(\vec{u})$ (i.e., in all vectors of values for the
variables in $\mathcal{V} - \vec{Y}$ that simultaneously satisfy all the equations $F_Z^{\vec{Y} \leftarrow \vec{y}, \vec{u}}$, for $Z \in \mathcal{V} - \vec{Y}$), the
variable $X$ has value $x$. We define the truth value of arbitrary causal formulas, which are
just Boolean combinations of basic causal formulas, in the obvious way:

• $T \models \varphi_1 \land \varphi_2$ if $T \models \varphi_1$ and $T \models \varphi_2$

• $T \models \neg\varphi$ if $T \not\models \varphi$.

As usual, we say that a formula $\varphi$ is valid with respect to a class $\mathcal{T}'$ of causal models if
$T \models \varphi$ for all $T \in \mathcal{T}'$.

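I can now make precise the earlier claim that in $\mathcal{T}_{uniq}$ (and hence $\mathcal{T}_{rec}$), the language
$\mathcal{L}_{uniq}$ is just as expressive as the full language $\mathcal{L}^+$.

2. This was confirmed by Judea Pearl [private communication, 1997].

Lemma 2.2: The following formulas are valid in $\mathcal{T}_{uniq}$:

(a) $\mathcal{T}_{uniq} \models [\vec{Y} \leftarrow \vec{y}](\varphi \lor \psi) \Leftrightarrow [\vec{Y} \leftarrow \vec{y}]\varphi \lor [\vec{Y} \leftarrow \vec{y}]\psi$,

(b) $\mathcal{T}_{uniq} \models [\vec{Y} \leftarrow \vec{y}](\varphi \land \psi) \Leftrightarrow [\vec{Y} \leftarrow \vec{y}]\varphi \land [\vec{Y} \leftarrow \vec{y}]\psi$,

(c) $\mathcal{T}_{uniq} \models [\vec{Y} \leftarrow \vec{y}]\neg\varphi \Leftrightarrow \neg[\vec{Y} \leftarrow \vec{y}]\varphi$.

Hence, in $\mathcal{T}_{uniq}$, every formula in $\mathcal{L}^+$ is equivalent to a formula in $\mathcal{L}_{uniq}$.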

Proof: Straightforward; left to the reader. | 

Note that it follows from these equivalences that in $\mathcal{T}_{uniq}$, $[\vec{Y} \leftarrow \vec{y}]\varphi$ is equivalent to
$\langle \vec{Y} \leftarrow \vec{y} \rangle \varphi$. It is also worth noting that Lemma 2.2(b) holds in arbitrary causal models in
$\mathcal{T}$, not just in $\mathcal{T}_{uniq}$. However, parts (a) and (c) do not, as the following example shows.

Example 2.3: Let $\mathcal{S} = (\emptyset, \{X, Y\}, \mathcal{R})$, where $\mathcal{R}(X) = \mathcal{R}(Y) = \{0, 1\}$; let $T = (\mathcal{S}, \mathcal{F})$,
where $F_X$ is characterized by the equation $X = Y$ and $F_Y$ is characterized by the equation
$Y = X$. Clearly $T \notin \mathcal{T}_{uniq}$; both $(0, 0)$ and $(1, 1)$ are solutions to $T$. It is easy to see that
$T \models [true](X = 0 \lor X = 1) \land \neg[true](X = 0) \land \neg[true](X = 1)$, showing that part (a) of
Lemma 2.2 does not hold in $\mathcal{T}$, and that $T \models \neg[true](X = 1) \land \neg[true]\neg(X = 1)$, showing
that part (c) does not hold either. |
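
The semantics of $[\,\cdot\,]$ and $\langle\,\cdot\,\rangle$ can be prototyped by enumerating solutions, which makes the failure of Lemma 2.2(a) and (c) in Example 2.3 easy to verify. The Python sketch below is only an illustration: the names `solutions`, `box`, and `diamond` are hypothetical, formulas after the modal operator are represented as Python predicates on an assignment, and exogenous variables are omitted because the example has none.

```python
from itertools import product

def solutions(ranges, equations, setting=None):
    """All assignments satisfying the equations, with set variables pinned."""
    setting = setting or {}
    names = sorted(ranges)
    result = []
    for values in product(*(ranges[n] for n in names)):
        a = dict(zip(names, values))
        if all(a[v] == (setting[v] if v in setting else equations[v](a))
               for v in names):
            result.append(a)
    return result

def box(ranges, equations, setting, phi):
    """[Y <- y]phi: phi holds in every solution (vacuously true if none)."""
    return all(phi(a) for a in solutions(ranges, equations, setting))

def diamond(ranges, equations, setting, phi):
    """<Y <- y>phi: phi holds in at least one solution."""
    return any(phi(a) for a in solutions(ranges, equations, setting))

# Example 2.3: X = Y and Y = X over {0, 1}.
RANGES = {"X": (0, 1), "Y": (0, 1)}
EQUATIONS = {"X": lambda a: a["Y"], "Y": lambda a: a["X"]}

# Part (a) fails: the box of a disjunction holds, but neither disjunct's box does.
print(box(RANGES, EQUATIONS, {}, lambda a: a["X"] == 0 or a["X"] == 1))
print(box(RANGES, EQUATIONS, {}, lambda a: a["X"] == 0))
print(box(RANGES, EQUATIONS, {}, lambda a: a["X"] == 1))
```

Once $Y$ is set, the submodel has a single solution, so `box` and `diamond` agree there, in line with the remark that the two operators coincide when solutions are unique.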

3. Complete Axiomatizations 

I briefly recall some standard definitions from logic. An axiom system AX consists of a
collection of axioms and inference rules. An axiom is a formula (in some predetermined
language $\mathcal{L}$), and an inference rule has the form "from $\varphi_1, \ldots, \varphi_k$ infer $\psi$," where $\varphi_1, \ldots, \varphi_k, \psi$
are formulas in $\mathcal{L}$. A proof in AX consists of a sequence of formulas in $\mathcal{L}$, each of which is
either an axiom in AX or follows by an application of an inference rule. A proof is said to
be a proof of the formula $\varphi$ if the last formula in the proof is $\varphi$. We say $\varphi$ is provable in AX,
and write AX $\vdash \varphi$, if there is a proof of $\varphi$ in AX; similarly, we say that $\varphi$ is consistent with
AX if $\neg\varphi$ is not provable in AX.

An axiom system AX is said to be sound for a language $\mathcal{L}$ with respect to a class $\mathcal{T}'$
of causal models if every formula in $\mathcal{L}$ provable in AX is valid with respect to $\mathcal{T}'$. AX is
complete for $\mathcal{L}$ with respect to $\mathcal{T}'$ if every formula in $\mathcal{L}$ that is valid with respect to $\mathcal{T}'$ is
provable in AX.

We now want to find axioms that characterize the classes of causal models in which we
are interested, namely $\mathcal{T}_{rec}$, $\mathcal{T}_{uniq}$, and $\mathcal{T}$. To deal with $\mathcal{T}_{rec}$, it is helpful to define $Y \leadsto Z$,
read "$Y$ affects $Z$", as an abbreviation for the formula

$$\bigvee_{\vec{X} \subseteq \mathcal{V},\ \vec{x} \in \times_{X \in \vec{X}} \mathcal{R}(X),\ y \in \mathcal{R}(Y),\ \vec{u} \in \times_{U \in \mathcal{U}} \mathcal{R}(U),\ z \neq z' \in \mathcal{R}(Z)} \left( Z_{\vec{x}y}(\vec{u}) = z' \land Z_{\vec{x}}(\vec{u}) = z \right).$$

Thus, $Y$ affects $Z$ if there is some setting of the exogenous variables and some other
endogenous variables for which changing the value of $Y$ changes the value of $Z$. This definition is
used in axiom C6 below, which characterizes recursiveness.
Consider the following axioms: 

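C0. All instances of propositional tautologies.

C1. $X_{\vec{y}}(\vec{u}) = x \Rightarrow X_{\vec{y}}(\vec{u}) \neq x'$ if $x, x' \in \mathcal{R}(X)$, $x \neq x'$ (equality)

C2. $\bigvee_{x \in \mathcal{R}(X)} X_{\vec{y}}(\vec{u}) = x$ (definiteness)

C3. $(W_{\vec{x}}(\vec{u}) = w \land Y_{\vec{x}}(\vec{u}) = y) \Rightarrow Y_{\vec{x}w}(\vec{u}) = y$ (composition)

C4. $X_{x\vec{w}}(\vec{u}) = x$ (effectiveness)

C5. $(Y_{\vec{x}w}(\vec{u}) = y \land W_{\vec{x}y}(\vec{u}) = w) \Rightarrow Y_{\vec{x}}(\vec{u}) = y$ (reversibility)

C6. $(X_0 \leadsto X_1 \land \ldots \land X_{k-1} \leadsto X_k) \Rightarrow \neg(X_k \leadsto X_0)$ (recursiveness)

We have one rule of inference:

MP. From $\varphi$ and $\varphi \Rightarrow \psi$, infer $\psi$ (modus ponens)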

C1 just states an obvious property of equality: if $X = x$ for every solution of the
equations in $T_{\vec{y}}(\vec{u})$, then we cannot have $X = x'$, if $x' \neq x$.³ In a richer language, this could
have been expressed as $(X_{\vec{y}}(\vec{u}) = x \land X_{\vec{y}}(\vec{u}) = x') \Rightarrow (x = x')$, but this formula is not in $\mathcal{L}^+$
(since $\mathcal{L}^+$ does not include expressions such as $x = x'$). C2 states that there is some value
$x \in \mathcal{R}(X)$ which is the value of $X$ in all solutions to the equations in $T_{\vec{y}}(\vec{u})$. C2 is not valid
in $\mathcal{T}$, but it is valid in $\mathcal{T}_{uniq}$. Note that in stating C2, I am making use of the fact that
$\mathcal{R}(X)$ is finite (otherwise C2 would involve an infinite disjunction, and would no longer be a
formula in $\mathcal{L}_{uniq}$). In fact, it can be shown that if we allow signatures where the sets $\mathcal{R}(X)$
are infinite, we include C2 only for those random variables $X$ such that $\mathcal{R}(X)$ is finite.⁴
C3-C5 were introduced by Galles and Pearl (1997, 1998), as were their names. Roughly
speaking, C3 says that if the value of $W$ is $w$ in all solutions to the equations $T_{\vec{x}}(\vec{u})$, then
all solutions to the equations in $T_{\vec{x}w}(\vec{u})$ are the same as the solutions to the equations in
$T_{\vec{x}}(\vec{u})$. C3 is valid in $\mathcal{T}$ as well as $\mathcal{T}_{uniq}$. As we shall see, a variant of C3 (obtained by
replacing "all" by "some") is also valid in $\mathcal{T}$. C4 simply says that in all solutions obtained
after setting $X$ to $x$, the value of $X$ is $x$. C5 is perhaps the least obvious of these axioms;
the proof of its soundness is not at all straightforward. It says that if setting $\vec{X}$ to $\vec{x}$ and $W$
to $w$ results in $Y$ having value $y$ and setting $\vec{X}$ to $\vec{x}$ and $Y$ to $y$ results in $W$ having value
$w$, then $Y$ must already have value $y$ when we set $\vec{X}$ to $\vec{x}$ (and $W$ must already have value
$w$).

Finally, it is easy to see that C6 holds in recursive models. For if $Y \leadsto Z$, then $Y$ must
precede $Z$ in the causal ordering. Thus, if $X_0 \leadsto X_1 \land \ldots \land X_{k-1} \leadsto X_k$, then $X_0$ must
precede $X_k$ in the causal ordering, so $X_k$ cannot affect $X_0$. Thus, $\neg(X_k \leadsto X_0)$ holds. As
we shall see, in a precise sense, C6 characterizes recursive models.
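
For small models with unique solutions, the "affects" relation and axiom C6 can be checked by brute force. The Python sketch below is an illustration under stated assumptions, not a procedure from the paper: the names `unique_solution`, `affects`, and `satisfies_c6` are hypothetical, exogenous variables are omitted, the extra variables that are set are drawn from $\mathcal{V} - \{Y, Z\}$, and C6 (for all $k \geq 1$) is tested as the absence of any chain of "affects" edges that is closed by a reverse edge.

```python
from itertools import combinations, product

def unique_solution(ranges, equations, setting):
    names = sorted(ranges)
    sols = []
    for values in product(*(ranges[n] for n in names)):
        a = dict(zip(names, values))
        if all(a[v] == (setting[v] if v in setting else equations[v](a))
               for v in names):
            sols.append(a)
    assert len(sols) == 1, "sketch assumes a model with unique solutions"
    return sols[0]

def affects(ranges, equations, y, z):
    """True if, for some setting of other variables, setting y changes z."""
    others = [v for v in sorted(ranges) if v not in (y, z)]
    for size in range(len(others) + 1):
        for subset in combinations(others, size):
            for values in product(*(ranges[v] for v in subset)):
                base = dict(zip(subset, values))
                z_without = unique_solution(ranges, equations, base)[z]
                for y_value in ranges[y]:
                    with_y = {**base, y: y_value}
                    if unique_solution(ranges, equations, with_y)[z] != z_without:
                        return True
    return False

def satisfies_c6(ranges, equations):
    names = sorted(ranges)
    edge = {(a, b) for a in names for b in names
            if a != b and affects(ranges, equations, a, b)}
    reach = set(edge)  # chains X_0 ~> ... ~> X_k with k >= 1
    changed = True
    while changed:
        changed = False
        for (a, b) in list(reach):
            for (c, d) in edge:
                if b == c and (a, d) not in reach:
                    reach.add((a, d))
                    changed = True
    return not any((b, a) in edge for (a, b) in reach)

BINARY = {"X0": (0, 1), "X1": (0, 1), "X2": (0, 1)}
CHAIN = {"X0": lambda a: 0, "X1": lambda a: a["X0"], "X2": lambda a: a["X1"]}

TERNARY = {"X": (-1, 0, 1), "Y": (-1, 0, 1)}
FEEDBACK = {"X": lambda a: a["Y"], "Y": lambda a: -a["X"]}  # Example 2.1

print(satisfies_c6(BINARY, CHAIN), satisfies_c6(TERNARY, FEEDBACK))
```

The chain model is recursive and passes the check, while the model of Example 2.1 has unique solutions but fails it, since $X$ and $Y$ affect each other.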

C6 can be viewed as a collection of axioms (actually, axiom schemes), one for each $k$.
The case $k = 1$ already gives us $\neg(Y \leadsto Z) \lor \neg(Z \leadsto Y)$ for all variables $Y$ and $Z$. That

3. In an earlier draft of this paper, where C1 and C2 were introduced, C1 was called "uniqueness". Galles
and Pearl (1998) then adopted this name as well. In retrospect, this axiom really does not say anything
about uniqueness. The axiom which does is D10, which will be discussed later.

4. The assumption that $\mathcal{R}(X)$ and $\mathcal{V}$ are finite is also necessary for the abbreviation $X \leadsto Y$ used in C6 to
be in $\mathcal{L}_{uniq}$; however, we can replace C6 by the axiom scheme

$$\neg\left( \bigwedge_{i=0}^{k-1} \left( (X_{i+1})_{\vec{y}_i x_i}(\vec{u}_i) = z_i \land (X_{i+1})_{\vec{y}_i}(\vec{u}_i) = z'_i \right) \land (X_0)_{\vec{y}_k x_k}(\vec{u}_k) = z_k \land (X_0)_{\vec{y}_k}(\vec{u}_k) = z'_k \right),$$

where $x_i \in \mathcal{R}(X_i)$ for $i = 1, \ldots, k$. That is, we essentially replace C6 by all its instances. This axiom is
equivalent to C6 (although not as transparent) and can be expressed even if $|\mathcal{V}|$ is infinite or $|\mathcal{R}(X)|$ is
infinite for some $X \in \mathcal{V}$.
is, it tells us that, for any pair of variables, at most one affects the other. However, just
restricting C6 to the case of $k = 1$ does not suffice to characterize $\mathcal{T}_{rec}$, as the following
example shows.

Example 3.1: Let $\mathcal{S} = (\emptyset, \{X_0, X_1, X_2\}, \mathcal{R})$, where $\mathcal{R}(X_0) = \mathcal{R}(X_1) = \mathcal{R}(X_2) = \{0, 1, 2\}$,
and let $T = (\mathcal{S}, \mathcal{F})$, where $F_{X_i}$ is characterized by the equation




otherwise 



and $\oplus$ is addition mod 3. It is easy to see that $T \in \mathcal{T}_{uniq}$: If any of the variables are set, the
equations completely determine the values of all the other variables. On the other hand, if
none of the variables are set, it is easy to see that $(0, 0, 0)$ is the only solution that satisfies all
the equations. Moreover, in $T_{\vec{Y} \leftarrow \vec{y}}$, the variable $X_i$ is 0 unless it is set to a value other than
0 or $X_{i \ominus 1}$ is set to 1. It easily follows that $X_i$ is affected only by $X_{i \ominus 1}$. A straightforward
verification (or an appeal to Theorem 3.2 below) shows that $T$ satisfies all the axioms other
than C6. C6 does not hold in $T$, since $T \models X_0 \leadsto X_1 \land X_1 \leadsto X_2 \land X_2 \leadsto X_0$. This also
shows that $T$ is not recursive. However, the restricted version of C6 (where $k = 1$) does
hold in $T$. A generalization of this example (with $k$ random variables rather than just 2)
can be used to show that we cannot bound $k$ at all in C6; we need C6 to hold for all finite
values of $k$. |

Let $AX_{uniq}(\mathcal{S})$ consist of C0-C5 and MP; let $AX_{rec}(\mathcal{S})$ consist of C0-C4, C6, and MP.
We could include C5 in $AX_{rec}(\mathcal{S})$; I did not do so because, as Galles and Pearl (1998) point
out, it follows from C3 and C6. Note that the signature $\mathcal{S}$ is a parameter of the axiom
system, just as it is for the language and the set of models. This is because, for example,
the set $\mathcal{R}(X)$ (which is determined by $\mathcal{S}$) appears explicitly in C1 and C2.

Theorem 3.2: $AX_{uniq}(\mathcal{S})$ (resp., $AX_{rec}(\mathcal{S})$) is a sound and complete axiomatization for
$\mathcal{L}_{uniq}(\mathcal{S})$ with respect to $\mathcal{T}_{uniq}(\mathcal{S})$ (resp., $\mathcal{T}_{rec}(\mathcal{S})$).

Proof: See the appendix. | 

As I said in the introduction, Galles and Pearl (1998) prove a similar completeness result
for causal models whose variables satisfy a fixed causal ordering. Given a total ordering $\prec$
on the variables in $\mathcal{V}$, consider the following axiom:

Ord. $Y_{\vec{x}w}(\vec{u}) = Y_{\vec{x}}(\vec{u})$ if $Y \prec W$

Since $\vec{x}$, $w$, and $\vec{u}$ are implicitly universally quantified in Ord, this axiom says that
$\neg(W \leadsto Y)$ holds if $Y \prec W$. It follows that if $W \leadsto Y$, then $W \prec Y$. From this and the fact that $\prec$
is a total order, it is easy to see that Ord implies C6.

Galles and Pearl show that C1-C4 and Ord is a sound and complete axiomatization
with respect to the class of causal models satisfying Ord for $\mathcal{L}_{GP}$. More precisely, Galles
and Pearl take $A_C$ to consist of the axioms C1-C4 and Ord (but not C0 or MP), and show,
in their notation, that $S \models \sigma$ implies $S \vdash_{A_C} \sigma$, where $S \cup \{\sigma\}$ is a set of formulas in $\mathcal{L}_{GP}$.
There is an important subtle point worth stressing about their result: C1 and C2, which
are axioms in $A_C$, are not expressible in $\mathcal{L}_{GP}$ (since their statement involves disjunction
and negation).

So what exactly is Galles and Pearl's result saying? They interpret $S \models \sigma$, as usual,
as meaning that in all causal models satisfying $S$, $\sigma$ is true.⁵ They interpret $S \vdash_{A_C} \sigma$ as
meaning that $\sigma$ is provable from $S$ and the axioms of $A_C$ "together with the
rules of logic", which presumably means C0 and MP. It follows easily from Theorem 3.2
that their result is correct (see below), but it is unlike typical soundness and completeness
proofs, since the proof of $\sigma$ from $S$ will in general involve formulas in $\mathcal{L}_{uniq}$ that are not in
$\mathcal{L}_{GP}$. (In particular, this will happen whenever C1-C3 are used in the proof.)

To see that Galles and Pearl's result follows from Theorem 3.2, define $S^*$ to be the
formula in $\mathcal{L}_{uniq}(\mathcal{S})$ which is the conjunction of the formulas in $S$ (there can only be finitely
many, since $\mathcal{L}_{GP}(\mathcal{S})$ itself has only finitely many distinct formulas), together with the
conjunction of all the instances of the axiom Ord (again, there are only finitely many). Note
that $S \models \sigma$ holds iff $\mathcal{T}_{uniq}(\mathcal{S}) \models S^* \Rightarrow \sigma$ (since the formulas in Ord guarantee that the only
causal models that satisfy $S^*$ are recursive, and hence are in $\mathcal{T}_{uniq}(\mathcal{S})$). Thus, by Theorem 3.2,
$S \models \sigma$ iff $AX_{uniq}(\mathcal{S}) \vdash S^* \Rightarrow \sigma$. The latter statement is equivalent to $S \vdash_{A_C} \sigma$,
as defined by Galles and Pearl. In fact, Theorem 3.2 shows that $AX_{uniq}(\mathcal{S})$ + Ord gives
a sound and complete axiomatization with respect to causal models satisfying Ord for the
language $\mathcal{L}_{uniq}(\mathcal{S})$, which allows Boolean connectives. (Of course, Theorem 3.2 shows more,
since it extends Galles and Pearl's result to $\mathcal{T}_{rec}(\mathcal{S})$ and $\mathcal{T}_{uniq}(\mathcal{S})$.) This suggests that $\mathcal{L}_{uniq}$
is a more appropriate language for reasoning about causality than $\mathcal{L}_{GP}$, at least for causal
models in $\mathcal{T}_{uniq}$. $\mathcal{L}_{GP}$ cannot express a number of properties of causal reasoning of interest
(for example, the ones captured by axioms C1-C3). When we use $\mathcal{L}_{uniq}$, not only is every
formula in $\mathcal{L}_{uniq}$ valid in $\mathcal{T}_{uniq}$ provable from the axioms in $AX_{uniq}$, but the proof involves
only formulas in $\mathcal{L}_{uniq}$.

What about $\mathcal{T}$? I have not been able to find a complete axiomatization for the language
$\mathcal{L}_{uniq}$ with respect to $\mathcal{T}$. However, I do not think that finding a complete axiomatization
for $\mathcal{L}_{uniq}$ with respect to $\mathcal{T}$ is of great interest, because $\mathcal{L}_{uniq}$ is simply not a language
appropriate for reasoning about causality in $\mathcal{T}$. Because there is not necessarily a unique
solution to the equations that arise in a causal model $T \in \mathcal{T}$, it is useful to be able to say
both that there exists a solution with certain properties and that all solutions have certain
properties. This is precisely what the language $\mathcal{L}^+$ lets us do.⁶ As I now show, there is in
fact an elegant sound and complete axiomatization for $\mathcal{L}^+$ with respect to $\mathcal{T}$.

Consider the following axioms: 

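D0. All instances of propositional tautologies.

D1. $[\vec{Y} \leftarrow \vec{y}](X(\vec{u}) = x \Rightarrow X(\vec{u}) \neq x')$ if $x, x' \in \mathcal{R}(X)$, $x \neq x'$ (functionality)

D2. $[\vec{Y} \leftarrow \vec{y}](\bigvee_{x \in \mathcal{R}(X)} X(\vec{u}) = x)$ (definiteness)

D3. $\langle \vec{X} \leftarrow \vec{x} \rangle (W(\vec{u}) = w \land \vec{Y}(\vec{u}) = \vec{y}) \Rightarrow \langle \vec{X} \leftarrow \vec{x}; W \leftarrow w \rangle (\vec{Y}(\vec{u}) = \vec{y})$ (composition)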

5. Although they do not say this explicitly, it is clear that they intend to further restrict to causal models
satisfying $S$ and Ord, for the fixed order $\prec$. Without this restriction, their result is not true.

6. Note that $\mathcal{L}^+$ allows us to say that there is a unique solution for a random variable $X$ after setting some
other variables. For example, $\langle \vec{Y} \leftarrow \vec{y} \rangle true(\vec{u}) \land [\vec{Y} \leftarrow \vec{y}](X(\vec{u}) = x)$ says that there are solutions to the
equations when $\vec{Y}$ is set to $\vec{y}$ and $\mathcal{U}$ is set to $\vec{u}$ and, in all of them, $X$ is uniquely determined to be $x$.
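D4. $[\vec{W} \leftarrow \vec{w}; X \leftarrow x](X(\vec{u}) = x)$ (effectiveness)

D5. $(\langle \vec{X} \leftarrow \vec{x}; Y \leftarrow y \rangle (W(\vec{u}) = w \land \vec{Z}(\vec{u}) = \vec{z}) \land \langle \vec{X} \leftarrow \vec{x}; W \leftarrow w \rangle (Y(\vec{u}) = y \land \vec{Z}(\vec{u}) = \vec{z}))$
$\Rightarrow \langle \vec{X} \leftarrow \vec{x} \rangle (W(\vec{u}) = w \land Y(\vec{u}) = y \land \vec{Z}(\vec{u}) = \vec{z})$, where $\vec{Z} = \mathcal{V} - (\vec{X} \cup \{W, Y\})$ (reversibility)

D6. $(X_0 \leadsto X_1 \land \ldots \land X_{k-1} \leadsto X_k) \Rightarrow \neg(X_k \leadsto X_0)$ (recursiveness)

D7. $([\vec{X} \leftarrow \vec{x}]\varphi \land [\vec{X} \leftarrow \vec{x}](\varphi \Rightarrow \psi)) \Rightarrow [\vec{X} \leftarrow \vec{x}]\psi$ (distribution)

D8. $[\vec{X} \leftarrow \vec{x}]\varphi$ if $\varphi$ is a propositional tautology (generalization)

D9. $\langle \vec{Y} \leftarrow \vec{y} \rangle true(\vec{u}) \land \bigvee_{x \in \mathcal{R}(X)} [\vec{Y} \leftarrow \vec{y}](X(\vec{u}) = x)$ if $\vec{Y} = \mathcal{V} - \{X\}$ (unique solutions for $\mathcal{V} - \{X\}$)

D10. $\langle \vec{Y} \leftarrow \vec{y} \rangle true(\vec{u}) \land \bigvee_{x \in \mathcal{R}(X)} [\vec{Y} \leftarrow \vec{y}](X(\vec{u}) = x)$ (unique solutions)

D11. $\langle \vec{Y} \leftarrow \vec{y} \rangle (\varphi_1(\vec{u}_1) \land \ldots \land \varphi_k(\vec{u}_k)) \Leftrightarrow (\langle \vec{Y} \leftarrow \vec{y} \rangle \varphi_1(\vec{u}_1) \land \ldots \land \langle \vec{Y} \leftarrow \vec{y} \rangle \varphi_k(\vec{u}_k))$, if $\varphi_i(\vec{u}_i)$
is a Boolean combination of formulas of the form $X(\vec{u}_i) = x$ and $\vec{u}_i \neq \vec{u}_j$ for $i \neq j$
(separability)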
(separability) 

D1-D6 are the analogues of C1-C6 in $\mathcal{L}^+$. D4 and D6 are just C4 and C6, with no
changes at all. The other axioms are not quite the same though. For example, C1 is
actually $[\vec{Y} \leftarrow \vec{y}](X(\vec{u}) = x) \Rightarrow \neg[\vec{Y} \leftarrow \vec{y}](X(\vec{u}) = x')$ if $x \neq x'$. By Lemma 2.2, this is
equivalent to D1 in $\mathcal{T}_{uniq}$; however, the two formulas are not equivalent in general. Similarly,
C2 is $\bigvee_{x \in \mathcal{R}(X)} [\vec{Y} \leftarrow \vec{y}](X(\vec{u}) = x)$, which is closer to D10 than D2 (since the disjunction
is outside the scope of the $[\vec{Y} \leftarrow \vec{y}]$). Again, D10 and D2 are equivalent in $\mathcal{T}_{uniq}$ (both are
equivalent to C2 in this case) but, in general, D10 is stronger than D2. Only D2 and D9,
both weaker than D10, hold in $\mathcal{T}$. The exact analogue of C3 would use $[\,]$ instead of $\langle\,\rangle$ and
say $Y(\vec{u}) = y$ instead of $\vec{Y}(\vec{u}) = \vec{y}$. For completeness, it is necessary to have a vector of
variables here. Using $[\,]$ instead of $\langle\,\rangle$ also results in a valid formula (and would not require a
vector $\vec{y}$). While the two variants are equivalent in $\mathcal{T}_{uniq}$, they are different in general, and
the one given here is the more useful. (More precisely, with it we get completeness, while
the version with $[\,]$ does not suffice for completeness.) Similarly, in D5, we use $\langle\,\rangle$ instead of
$[\,]$, and add the extra clause $\vec{Z}(\vec{u}) = \vec{z}$. Both turn out to be necessary for soundness. In some
sense, we can think of D1-D6 as capturing the "true content" of C1-C6, once we drop the
assumption that the structural equations have a unique solution. D7 and D8 are standard
properties of modal operators. D10 is what we need to capture the fact that the structural
equations have unique solutions. D11 essentially says that the solutions to the equations
that arise when the exogenous variables are set to $\vec{u}$ are independent of the solutions that
arise when the exogenous variables are set to $\vec{u}' \neq \vec{u}$.

Let $AX^+$ consist of D0-D5, D7-D9, D11, and MP (modus ponens); let $AX^+_{uniq}$ be the
result of adding D10 to $AX^+$; let $AX^+_{rec}$ be the result of adding D6 to $AX^+_{uniq}$.

Theorem 3.3: $AX^+(\mathcal{S})$ (resp., $AX^+_{uniq}(\mathcal{S})$, $AX^+_{rec}(\mathcal{S})$) is a sound and complete
axiomatization for $\mathcal{L}^+(\mathcal{S})$ with respect to $\mathcal{T}(\mathcal{S})$ (resp., $\mathcal{T}_{uniq}(\mathcal{S})$, $\mathcal{T}_{rec}(\mathcal{S})$).

Proof: See the appendix. ∎
4. Decision Procedures

In this section I consider the complexity of deciding if a formula is satisfiable (or valid).
This, of course, depends on the language ($\mathcal{L}^+$, $\mathcal{L}_{uniq}$, or $\mathcal{L}_{GP}$) and the class of models
($\mathcal{T}_{rec}$, $\mathcal{T}_{uniq}$, $\mathcal{T}$) we consider. It also depends on how we formulate the problem.

One version of the problem is to consider a fixed signature $\mathcal{S} = (\mathcal{U}, \mathcal{V}, \mathcal{R})$, and ask how
hard it is to decide if a formula $\varphi \in \mathcal{L}^+(\mathcal{S})$ (resp., $\mathcal{L}_{uniq}(\mathcal{S})$, $\mathcal{L}_{GP}(\mathcal{S})$) is satisfiable in $\mathcal{T}_{rec}(\mathcal{S})$
(resp., $\mathcal{T}_{uniq}(\mathcal{S})$, $\mathcal{T}(\mathcal{S})$). If $\mathcal{S}$ is finite (that is, if $\mathcal{V}$ and $\mathcal{U}$ are finite and $\mathcal{R}(Y)$ is finite for
each $Y \in \mathcal{U} \cup \mathcal{V}$), this turns out to be quite easy, for trivial reasons.

Theorem 4.1: If $\mathcal{S}$ is a fixed finite signature, the problem of deciding if a formula $\varphi \in
\mathcal{L}^+(\mathcal{S})$ (resp., $\mathcal{L}_{uniq}(\mathcal{S})$, $\mathcal{L}_{GP}(\mathcal{S})$) is satisfiable in $\mathcal{T}_{rec}(\mathcal{S})$ (resp., $\mathcal{T}_{uniq}(\mathcal{S})$, $\mathcal{T}(\mathcal{S})$) can be
solved in time linear in $|\varphi|$ (the length of $\varphi$ viewed as a string of symbols).

Proof: If $\mathcal{S}$ is finite, there are only finitely many causal models in $\mathcal{T}(\mathcal{S})$, independent of
$\varphi$. Given $\varphi$, we can explicitly check if $\varphi$ is satisfied in any (or all) of them. This can be
done in time linear in $|\varphi|$. Since $\mathcal{S}$ is not a parameter to the problem, the huge number of
possible causal models that we have to check affects only the constant. ∎
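
The brute-force check in this proof can be made concrete on a toy signature. The Python sketch below is illustrative only: the two-variable binary signature with no exogenous variables, the encoding of formulas as Python predicates, and all function names are assumptions made for the example rather than anything defined in the paper. It enumerates every causal model over the signature and tests a formula in each.

```python
from itertools import product

VARS = ("X", "Y")   # endogenous variables of a hypothetical toy signature
DOMAIN = (0, 1)     # R(X) = R(Y) = {0, 1}


def all_models():
    """Yield every causal model: one function table per endogenous variable."""
    tables = []
    for v in VARS:
        n_inputs = len(VARS) - 1  # F_v takes the values of all other variables
        inputs = list(product(DOMAIN, repeat=n_inputs))
        # Every function from input tuples to DOMAIN, stored as a dict.
        tables.append([dict(zip(inputs, outs))
                       for outs in product(DOMAIN, repeat=len(inputs))])
    for choice in product(*tables):
        yield dict(zip(VARS, choice))


def solutions(model, do):
    """All assignments that solve the equations after the intervention `do`."""
    sols = []
    for values in product(DOMAIN, repeat=len(VARS)):
        world = dict(zip(VARS, values))
        ok = True
        for v in VARS:
            if v in do:
                ok = ok and world[v] == do[v]
            else:
                args = tuple(world[w] for w in VARS if w != v)
                ok = ok and world[v] == model[v][args]
        if ok:
            sols.append(world)
    return sols


def possibly(model, do, event):
    """<do> event: the event holds in some solution."""
    return any(event(w) for w in solutions(model, do))


def necessarily(model, do, event):
    """[do] event: the event holds in every solution."""
    return all(event(w) for w in solutions(model, do))


def satisfiable(formula):
    """Brute force: is the formula true in at least one model?"""
    return any(formula(m) for m in all_models())
```

For this signature there are only 16 models, so the loop in `satisfiable` is a constant amount of work per formula; the cost that depends on the input is just the evaluation of the formula itself.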

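We can do even better than Theorem 4.1 suggests if $\mathcal{S}$ is a fixed finite signature. Suppose
that $\mathcal{V}$ consists of 100 variables and $\varphi$ mentions only 3 of them. A causal model must specify
the equations for all 100 variables. Is it really necessary to consider what happens to the
97 variables not mentioned in $\varphi$ to decide if $\varphi$ is satisfiable or valid? As the following result
shows, if we restrict to models in $\mathcal{T}_{uniq}$, then we need to check only the variables that appear
in $\varphi$. Given a signature $\mathcal{S} = (\mathcal{U}, \mathcal{V}, \mathcal{R})$, let $\mathcal{S}_\varphi = (\{U^*\}, \mathcal{V}_\varphi, \mathcal{R}_\varphi)$, where $\mathcal{V}_\varphi$ consists of the
variables in $\mathcal{V}$ that appear in $\varphi$, $U^*$ is a fresh exogenous variable, not mentioned in $\mathcal{V}$ or $\mathcal{U}$,
$\mathcal{R}_\varphi(X) = \mathcal{R}(X)$ for $X \in \mathcal{V}_\varphi$, and $\mathcal{R}_\varphi(U^*)$ consists of all those tuples in $\times_{U \in \mathcal{U}} \mathcal{R}(U)$ that are
mentioned in $\varphi$.

Theorem 4.2: A formula $\varphi \in \mathcal{L}^+(\mathcal{S})$ is satisfiable in $\mathcal{T}_{rec}(\mathcal{S})$ (resp., $\mathcal{T}_{uniq}(\mathcal{S})$) iff it is
satisfiable in $\mathcal{T}_{rec}(\mathcal{S}_\varphi)$ (resp., $\mathcal{T}_{uniq}(\mathcal{S}_\varphi)$).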

Proof: See the appendix. ∎

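The analogue to Theorem 4.2 does not hold for $\mathcal{T}$. For example, suppose that $\mathcal{S} =
(\emptyset, \{X, Y, Z\}, \mathcal{R})$, where $\mathcal{R}(X) = \mathcal{R}(Y) = \mathcal{R}(Z) = \{0, 1\}$, and $\varphi$ is the formula
$\langle X \leftarrow 0\rangle(Y = 0) \wedge \langle X \leftarrow 0\rangle(Y = 1)$. It is easy to see that there is a causal model in $\mathcal{T}(\mathcal{S})$
satisfying $\varphi$. For example, if $T = (\mathcal{S}, \mathcal{F})$, where $F_X(y, z) = y \oplus z$, $F_Y(x, z) = x \oplus z$ and
$F_Z(x, y) = x \oplus y$, and $\oplus$ represents addition mod 2, then it is easy to check that $T \models \varphi$.
On the other hand, there is no causal model $T' \in \mathcal{T}(\mathcal{S}_\varphi)$ such that $T' \models \varphi$. For suppose
$T' \models \varphi$ and $T' = (\mathcal{S}_\varphi, \mathcal{F}')$. Since $T' \models \langle X \leftarrow 0\rangle(Y = 0)$, we must have $F'_Y(0) = 0$; since
$T' \models \langle X \leftarrow 0\rangle(Y = 1)$, we must have $F'_Y(0) = 1$. But we cannot have both $F'_Y(0) = 0$ and
$F'_Y(0) = 1$, since $F'_Y$ is a function.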
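
The counterexample can be checked mechanically. The following Python sketch is illustrative only (the dictionary encoding of the equations and the function names are assumptions made for the example): it enumerates the solutions of the three XOR equations after setting X to 0, and then confirms that no choice of a single value for F'_Y(0) can produce both Y = 0 and Y = 1 in the restricted signature.

```python
from itertools import product

# Structural equations of the example: each variable is the XOR of the other two.
EQUATIONS = {
    "X": lambda w: w["Y"] ^ w["Z"],
    "Y": lambda w: w["X"] ^ w["Z"],
    "Z": lambda w: w["X"] ^ w["Y"],
}


def solutions(equations, do):
    """All binary assignments solving the equations after the intervention `do`."""
    names = sorted(equations)
    sols = []
    for values in product((0, 1), repeat=len(names)):
        w = dict(zip(names, values))
        if all(w[v] == (do[v] if v in do else equations[v](w)) for v in names):
            sols.append(w)
    return sols


def restricted_satisfiable():
    """Can a model over {X, Y} alone satisfy <X<-0>(Y=0) and <X<-0>(Y=1)?"""
    # In the restricted signature Y's equation mentions only X, so after
    # X <- 0 the value of Y is the single number F'_Y(0).
    for f_y_at_0 in (0, 1):
        ys = {f_y_at_0}
        if {0, 1} <= ys:
            return True
    return False


sols = solutions(EQUATIONS, {"X": 0})
y_values = {w["Y"] for w in sols}
```

After the intervention the equations reduce to Y = Z and Z = Y, which is why two solutions survive in the full signature while the restricted signature admits only one.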

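There is a variant of Theorem 4.2 that does hold for $\mathcal{T}$ and that does give us a bound
on the number of variables we need to consider. Given a signature $\mathcal{S} = (\mathcal{U}, \mathcal{V}, \mathcal{R})$, define
$\|\mathcal{S}\| = \prod_{X \in \mathcal{V}} |\mathcal{R}(X)|$ (where we take $\|\mathcal{S}\| = \infty$ if either $\mathcal{V}$ is infinite or $|\mathcal{R}(X)| = \infty$
for some $X \in \mathcal{V}$). If $\|\mathcal{S}\| > \|\mathcal{S}_\varphi\|^2 + \|\mathcal{S}_\varphi\|$, let $\mathcal{S}_\varphi^+ = (\{U^*\}, \mathcal{V}_\varphi^+, \mathcal{R}_\varphi^+)$, where $\mathcal{V}_\varphi^+$ consists of
$\mathcal{V}_\varphi$, as defined above, together with one fresh endogenous variable $X^*$, $\mathcal{R}_\varphi^+(X^*) = \times_{X \in \mathcal{V}_\varphi} \mathcal{R}(X)$, and
$\mathcal{R}_\varphi^+(U^*) = \mathcal{R}_\varphi(U^*)$. If $\|\mathcal{S}\| \le \|\mathcal{S}_\varphi\|^2 + \|\mathcal{S}_\varphi\|$, let $\mathcal{S}_\varphi^+ = (\{U^*\}, \mathcal{V}, \mathcal{R}')$, where $\mathcal{R}'(X) = \mathcal{R}(X)$
for $X \in \mathcal{V}$ and $\mathcal{R}'(U^*) = \mathcal{R}_\varphi(U^*)$.

Theorem 4.3: A formula $\varphi \in \mathcal{L}^+(\mathcal{S})$ is satisfiable in $\mathcal{T}(\mathcal{S})$ iff $\varphi$ is satisfiable in $\mathcal{T}(\mathcal{S}_\varphi^+)$.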

Proof: See the appendix. ∎

Note that if $\|\mathcal{S}\| \le \|\mathcal{S}_\varphi\|^2 + \|\mathcal{S}_\varphi\|$, then, since we have assumed (without loss of
generality) that $|\mathcal{R}(X)| \ge 2$ for each variable $X$, it must be the case that there are at most
$2\log_2(\|\mathcal{S}_\varphi\|) + 1$ variables in signature $\mathcal{S}$.
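
The bound follows from a short chain of inequalities, spelled out here as a sketch. It reads "variables" as the endogenous variables of the signature (an assumption, based on the norm being defined over the endogenous variables only), writes $n$ for their number, and uses the fact that the norm of $\mathcal{S}_\varphi$ is at least 1.

```latex
% n = number of endogenous variables in S; each has at least two values
2^{n} \;\le\; \prod_{X \in \mathcal{V}} |\mathcal{R}(X)| \;=\; \|\mathcal{S}\|
      \;\le\; \|\mathcal{S}_\varphi\|^{2} + \|\mathcal{S}_\varphi\|
      \;\le\; 2\,\|\mathcal{S}_\varphi\|^{2}
\quad\Longrightarrow\quad
n \;\le\; 2\log_2 \|\mathcal{S}_\varphi\| + 1 .
```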

Since Theorems 4.2 and 4.3 apply to all formulas in $\mathcal{L}^+(\mathcal{S})$, they apply a fortiori to
formulas in $\mathcal{L}_{uniq}(\mathcal{S})$ and $\mathcal{L}_{GP}(\mathcal{S})$. Although stated only in terms of satisfiability, it is
immediate that they also hold for validity. Thus, they tell us that, without loss of generality,
when considering satisfiability or validity, we need to consider only finitely many variables
(essentially, the ones that appear in $\varphi$, and perhaps a few more). In this sense, we can
restrict to signatures with only finitely many variables without loss of generality. Note that
these results do not tell us that we can restrict to finite sets of values for these variables
without loss of generality.

Returning to the complexity of the decision problem, note that Theorem 4.1 is the ana- 
logue of the observation that for propositional logic, the satisfiability problem is in linear 
time if we restrict to a fixed set of primitive propositions. The proof that the satisfia- 
bility problem for propositional logic is NP-complete implicitly assumes that we have an 
unbounded number of primitive propositions at our disposal. 

There are two ways to get an analogous result here. The first is to allow the signature 
S to be infinite and the second is to make the signature part of the input to the problem. 
The results in both cases are similar, so I just consider the case where the signature is part 
of the input here. 

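Theorem 4.4: Given as input a pair $(\varphi, \mathcal{S})$, where $\varphi \in \mathcal{L}^+(\mathcal{S})$ (resp., $\mathcal{L}_{uniq}(\mathcal{S})$) and $\mathcal{S}$ is a
finite signature, the problem of deciding if $\varphi$ is satisfiable in $\mathcal{T}_{rec}(\mathcal{S})$ (resp., $\mathcal{T}_{uniq}(\mathcal{S})$, $\mathcal{T}(\mathcal{S})$)
is NP-complete (resp., NP-hard) in $|\varphi|$; if $\varphi \in \mathcal{L}_{GP}(\mathcal{S})$, then the problem of deciding if $\varphi$ is
satisfiable in $\mathcal{T}_{rec}(\mathcal{S})$ (resp., $\mathcal{T}_{uniq}(\mathcal{S})$) is NP-complete (resp., NP-hard).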

Proof: See the appendix. ∎

I believe that the problem of deciding if a formula $\varphi$ in $\mathcal{L}_{uniq}(\mathcal{S})$ or $\mathcal{L}^+(\mathcal{S})$ is satisfiable in
$\mathcal{T}_{uniq}(\mathcal{S})$ and $\mathcal{T}(\mathcal{S})$ is NP-complete, as is the case of deciding if $\varphi \in \mathcal{L}_{GP}(\mathcal{S})$ is satisfiable in
$\mathcal{T}_{uniq}(\mathcal{S})$. However, I have not been able to show this. What about the satisfiability problem
for formulas in $\mathcal{L}_{GP}$ in $\mathcal{T}(\mathcal{S})$? This may well be in constant time! Indeed, if $\mathcal{S}$ is an infinite
signature (that is, if $\mathcal{S} = (\mathcal{U}, \mathcal{V}, \mathcal{R})$ and $|\mathcal{V}| = \infty$), then it is provably in constant time. The
point is that a formula in $\mathcal{L}_{GP}(\mathcal{S})$ is trivially satisfiable in a structure $T \in \mathcal{T}(\mathcal{S})$ where
for all settings $\vec{X} \leftarrow \vec{x}$, the equations in $T_{\vec{X} \leftarrow \vec{x}}$ have no solutions, and there always is such
a structure if $\mathcal{S}$ has infinitely many variables. If $\mathcal{S}$ has only finitely many variables, we
do not have such trivial models, but it may still be possible to show that a "trivial enough"
model exists that satisfies the formula. This just emphasizes that $\mathcal{L}_{GP}(\mathcal{S})$ is simply too
weak a language to reason about models in $\mathcal{T}(\mathcal{S})$.
5. Conclusion

I have provided complete axiomatizations and decision procedures for propositional lan- 
guages for reasoning about causality. I have tried to stress the important role of the choice 
of language (and the signature) in both the axiomatizations and, more generally, in the 
reasoning process. 

Both the models and the languages considered here are somewhat limited. For example, 
a more general approach to modeling causality would allow there to be more than one value 
of X once we have set all the other variables. This would be appropriate if we model things 
at a somewhat coarser level of granularity, where the values of all the variables other than 
X do not suffice to completely determine the value of X. I believe the results of this paper 
can be extended in a straightforward way to deal with this generalization, although I have 
not checked the details. For general causal reasoning, I believe we need a richer language, 
which includes some first-order features. I hope to return to the issue of finding appropriate 
richer languages for causal reasoning in future work. 
Appendix A. Proofs

Theorem 3.2: $AX_{uniq}$ (resp., $AX_{rec}$) is a sound and complete axiomatization for $\mathcal{L}_{uniq}(\mathcal{S})$
with respect to $\mathcal{T}_{uniq}(\mathcal{S})$ (resp., $\mathcal{T}_{rec}(\mathcal{S})$).

Proof: Soundness is proved by Galles and Pearl. To make the paper self-contained, I
reprove the only non-obvious case: the validity of C5 in $\mathcal{T}_{uniq}$.

Let $T \in \mathcal{T}_{uniq}$ and suppose that $T \models Y_{\vec{x}w}(\vec{u}) = y \wedge W_{\vec{x}y}(\vec{u}) = w$. We want to show
that $T \models Y_{\vec{x}}(\vec{u}) = y$. Since we are in $\mathcal{T}_{uniq}$, there is a unique vector $\vec{v}_1$ that satisfies the
equations in $T_{\vec{x}w}(\vec{u})$ and a unique vector $\vec{v}_2$ that satisfies the equations in $T_{\vec{x}y}(\vec{u})$. I claim
that $\vec{v}_1 = \vec{v}_2$. By assumption, the $\vec{X}$, $Y$, and $W$ components of these vectors are the same
($\vec{x}$, $y$, and $w$, respectively). Now consider $T_{\vec{x}yw}(\vec{u})$. I claim that $\vec{v}_1$ and $\vec{v}_2$ are both
solutions to the equations in that causal theory. Note that for any variable $Z$ other than
those in $\vec{X} \cup \{W, Y\}$, the equation $F_Z^{\vec{x}w,\vec{u}}$ for $Z$ in $T_{\vec{x}w}(\vec{u})$ is the same as the equations $F_Z^{\vec{x}y,\vec{u}}$
and $F_Z^{\vec{x}yw,\vec{u}}$ for $Z$ in $T_{\vec{x}y}(\vec{u})$ and $T_{\vec{x}yw}(\vec{u})$, respectively, except that in the first case, $w$ has
been plugged in as the value of $W$, in the second case $y$ has been plugged in as the value of
$Y$, and in the third case, both $w$ and $y$ have been plugged in. However, since $w$ and $y$ are
the values of $W$ and $Y$, respectively, in both $\vec{v}_1$ and $\vec{v}_2$, and since these vectors satisfy both
equations $F_Z^{\vec{x}w}$ and $F_Z^{\vec{x}y}$, they must also satisfy $F_Z^{\vec{x}wy}$. Since the equations for $T_{\vec{x}yw}(\vec{u})$ have
a unique solution, we have that $\vec{v}_1 = \vec{v}_2$, as desired.

Next, I claim that $\vec{v}_1$ satisfies the equations in $T_{\vec{x}}(\vec{u})$. Again, as above, it is clear that
it satisfies the equation for $Z \notin \vec{X} \cup \{W, Y\}$. A similar argument shows that it satisfies the
equation for $Y$ in $T_{\vec{x}}(\vec{u})$, since $\vec{v}_1$ satisfies the equation for $Y$ in $T_{\vec{x}w}(\vec{u})$. Finally, a similar
argument shows that it satisfies the equation for $W$ in $T_{\vec{x}}(\vec{u})$, since $\vec{v}_2 = \vec{v}_1$ satisfies the
equation for $W$ in $T_{\vec{x}y}(\vec{u})$. Since the $Y$ component of $\vec{v}_1$ is $y$, it follows that $Y_{\vec{x}}(\vec{u}) = y$.

So much for soundness. For completeness, as usual, it suffices to prove that if a formula
in $\mathcal{L}_{uniq}$ is consistent with $AX_{uniq}$ (resp., $AX_{rec}$), then it is satisfied in a causal model in
$\mathcal{T}_{uniq}$ (resp., $\mathcal{T}_{rec}$). (Here's the argument: We want to show that every valid formula is
provable. Suppose that we have shown that every consistent formula is satisfiable and that
$\varphi$ is valid. If $\varphi$ is not provable, then $\neg\varphi$ is consistent. By assumption, this means that $\neg\varphi$
is satisfiable, contradicting the assumption that $\varphi$ is valid.)

I now give the argument in the case of $AX_{uniq}$.

Suppose that a formula $\varphi \in \mathcal{L}_{uniq}(\mathcal{S})$, with $\mathcal{S} = (\mathcal{U}, \mathcal{V}, \mathcal{R})$, is consistent with $AX_{uniq}$.
Consider a maximal consistent set $C$ of formulas that includes $\varphi$. (A maximal consistent set
is a set of formulas whose conjunction is consistent such that any larger set of formulas would
be inconsistent.) It follows easily from standard propositional reasoning (i.e., using C0 and
MP only) that such a maximal consistent set exists. Moreover, from C1 and C2, it follows
that for each random variable $X \in \mathcal{V}$ and vector $\vec{y}$ of values, there exists exactly one element
$x \in \mathcal{R}(X)$ such that $X_{\vec{y}}(\vec{u}) = x \in C$. I now construct a causal model $T = (\mathcal{S}, \mathcal{F}) \in \mathcal{T}_{uniq}(\mathcal{S})$
that satisfies every formula in $C$ (and, in particular, satisfies $\varphi$).

A term $X_{\vec{Y} \leftarrow \vec{y}}(\vec{u})$ is complete (for $X$) if $\vec{Y}$ consists of all the variables in $\mathcal{V} - \{X\}$. Thus,
$X_{\vec{Y} \leftarrow \vec{y}}(\vec{u})$ is a complete term if every random variable other than $X$ is determined. We use
the complete terms to define the structural equations. For each variable $X \in \mathcal{V}$, define
$F_X(\vec{u}, \vec{y}) = x$ if $X_{\vec{y}}(\vec{u}) = x$ is in $C$, where $X_{\vec{y}}(\vec{u})$ is a complete term. This gives us a causal model
$T$. Now we have to show that this model is in $\mathcal{T}_{uniq}$ and that all the formulas in $C$ are
satisfied by $T$.

I show that $X_{\vec{Y} \leftarrow \vec{y}}(\vec{u}) = x$ is in $C$ iff $T \models X_{\vec{Y} \leftarrow \vec{y}}(\vec{u}) = x$ by induction on $|\mathcal{V}| - |\vec{Y}|$. The
case where $|\mathcal{V}| - |\vec{Y}| = 0$ follows immediately from C4, since then $X$ is in $\vec{Y}$. If $|\mathcal{V}| - |\vec{Y}| \ne 0$,
we can assume without loss of generality that $X$ is not in $\vec{Y}$, for otherwise the result again
follows from C4. Given this assumption, if $|\mathcal{V}| - |\vec{Y}| = 1$, the result follows by definition of
the equations $F_X$.

For the general case, suppose that $|\mathcal{V}| - |\vec{Y}| = k > 1$. We want to show that there is a
unique solution to the equations in $T_{\vec{Y} \leftarrow \vec{y}}(\vec{u})$ and that, in this solution, $X$ has value $x$. To
see that there is a solution, we define a vector $\vec{v}$ and show that it is in fact a solution. If
$W \in \vec{Y}$ and $W \leftarrow w$ is the assignment to $W$ in $\vec{Y} \leftarrow \vec{y}$, then we set the $W$ component of $\vec{v}$
to $w$. If $W$ is not in $\vec{Y}$, then set the $W$ component of $\vec{v}$ to the unique value $w^*$ such that
$W_{\vec{Y} \leftarrow \vec{y}}(\vec{u}) = w^*$ is in $C$. (By C1 and C2 there is such a unique value $w^*$.) I claim that $\vec{v}$ is a
solution to the equations in $T_{\vec{Y} \leftarrow \vec{y}}(\vec{u})$.

To see this, let $W$ be a variable in $\mathcal{V}$ not in $\vec{Y}$. Let $\vec{Y}' = \vec{Y}W$. By C3 and C4, for every
variable $Z \in \mathcal{V} - \vec{Y}'$, we have that $Z_{\vec{y}w^*}(\vec{u}) = z^*$ is in $C$. Since $|\mathcal{V}| - |\vec{Y}'| = k - 1$, by the inductive
hypothesis, $\vec{v}$ is in fact the unique solution for $T_{\vec{y}w^*}(\vec{u})$. For every variable $Z$ in $\mathcal{V} - \vec{Y}'$, the
equation $F_Z^{\vec{y}w^*,\vec{u}}$ for $Z$ in $T_{\vec{y}w^*}(\vec{u})$ is the same as the equation $F_Z^{\vec{y},\vec{u}}$ for $Z$ in $T_{\vec{y}}(\vec{u})$, except
that $W$ is set to $w^*$. Thus, every equation in $T_{\vec{y}}(\vec{u})$ except possibly the equation $F_W^{\vec{y},\vec{u}}$ is
satisfied by $\vec{v}$. To see that $F_W^{\vec{y},\vec{u}}$ is also satisfied by $\vec{v}$, simply repeat the argument above
starting with another variable $W'$ in $\mathcal{V} - \vec{Y}$. (Such a variable must exist since $|\mathcal{V}| - |\vec{Y}|$ was
assumed to be at least 2.)

It remains to show that $\vec{v}$ is the unique solution to the equations in $T_{\vec{y}}(\vec{u})$. Suppose
there were another solution, say $\vec{v}'$, to the equations. Suppose that for each variable $W$
in $\mathcal{V} - \vec{Y}$, the $W$ component of $\vec{v}'$ is $w^{**}$. For some variable $Z$, we must have $z^{**} \ne z^*$.
Since $Z_{\vec{y}}(\vec{u}) = z^*$ is in $C$, by assumption, it follows from C1 that $Z_{\vec{y}}(\vec{u}) \ne z^{**}$ is in $C$ (since $C$ is
a maximal consistent set). It is also easy to see that for each $W$ in $\mathcal{V} - \vec{Y}$, the vector $\vec{v}'$ is
also a solution to the equations in $T_{\vec{y}w^{**}}(\vec{u})$. Let $W$ be a variable other than $Z$ in $\mathcal{V} - \vec{Y}$.
By the induction hypothesis, it follows that $W_{\vec{y}z^{**}}(\vec{u}) = w^{**}$ and $Z_{\vec{y}w^{**}}(\vec{u}) = z^{**}$ are both
in $C$. By C5 (reversibility), $Z_{\vec{y}}(\vec{u}) = z^{**}$ is in $C$. But this contradicts the consistency of $C$.

This completes the proof in the case of $\mathcal{T}_{uniq}(\mathcal{S})$. Essentially the same proof works for
$\mathcal{T}_{rec}$. We just need to observe that C6 guarantees that the theory we construct can be taken
to be recursive. To see this, given a formula $\varphi$ consistent with $AX_{rec}$, consider a maximal set
$C$ of formulas consistent with $AX_{rec}$ that contains $\varphi$. Let $T_C$ be the causal model determined
by $C$, as above. The set $C$ also determines a relation $\prec$ on the endogenous variables: define
$Y \prec Z$ if $Y \leadsto Z \in C$. It easily follows from C6 that the transitive closure $\prec^*$ of $\prec$ is
a partial order: if $X \prec^* Y$ and $Y \prec^* X$, then $X = Y$. Any total order on the variables
consistent with $\prec^*$ gives an ordering for which $T_C$ is recursive. ∎
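
The last step, picking a total order consistent with the closure of the "affects" relation, is a topological sort. The Python sketch below is illustrative only: the encoding of the relation as a set of pairs and the function name are assumptions made for the example. It uses the standard-library `graphlib` module (Python 3.9 or later) and returns `None` when the relation has a cycle, which is exactly the situation that C6 rules out.

```python
from graphlib import CycleError, TopologicalSorter


def recursive_order(affects):
    """Return a total order extending the relation, or None if it is cyclic.

    `affects` is a set of pairs (Y, Z), read as "Y precedes Z".
    """
    predecessors = {}
    for y, z in affects:
        predecessors.setdefault(z, set()).add(y)
        predecessors.setdefault(y, set())
    try:
        # static_order lists every node after all of its predecessors.
        return list(TopologicalSorter(predecessors).static_order())
    except CycleError:
        return None
```

For instance, `recursive_order({("X", "Y"), ("Y", "Z")})` yields the order X, Y, Z, while a relation containing both ("X", "Y") and ("Y", "X") has no compatible order.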

Theorem 3.3: $AX^+$ (resp., $AX^+_{uniq}$, $AX^+_{rec}$) is a sound and complete axiomatization for
$\mathcal{L}^+(\mathcal{S})$ with respect to $\mathcal{T}(\mathcal{S})$ (resp., $\mathcal{T}_{uniq}(\mathcal{S})$, $\mathcal{T}_{rec}(\mathcal{S})$).

Proof: Soundness proceeds much as that of Theorem 3.2; I leave details to the reader. For
completeness, we again proceed much as in the proof of Theorem 3.2. Because the proofs
are so similar in spirit, I just sketch the proof for $AX^+$; the modifications for $AX^+_{uniq}$ and
$AX^+_{rec}$ are left to the reader.

Again, given a formula $\varphi$ consistent with $AX^+$, we consider a maximal consistent set $C$ of
formulas containing $\varphi$ that is consistent with $AX^+$, and use it to construct a causal model
$T$. Note that D9 suffices for this, because in defining $F_X(\vec{u}, \vec{y})$, we needed to know only the
unique $x$ such that $[\vec{Y} \leftarrow \vec{y}](X(\vec{u}) = x)$ for $\vec{Y} = \mathcal{V} - \{X\}$, and D9 (together with D1) assures
us that there is a unique such $x$. Again, we want to show that all the formulas in $C$ are
satisfied by $T$.

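To do this, it clearly suffices to show that for every formula $\psi$ of the form $\langle\vec{Y} \leftarrow \vec{y}\rangle\varphi$,
we have $\psi$ in $C$ iff $T \models \psi$. We can reduce to considering even simpler formulas, namely,
ones where $\varphi$ has the form $\vec{X}(\vec{u}) = \vec{x}$, by applying some of the axioms. To see this, first
observe that standard arguments of modal logic (using D0, D7, D8, and MP) show that
$\langle\vec{Y} \leftarrow \vec{y}\rangle(\varphi_1 \vee \varphi_2)$ is provably equivalent to $\langle\vec{Y} \leftarrow \vec{y}\rangle\varphi_1 \vee \langle\vec{Y} \leftarrow \vec{y}\rangle\varphi_2$. That means we can
assume without loss of generality that $\varphi$ is a conjunction of formulas of the form $X(\vec{u}) = x$
and their negations. From D2 it follows that $\langle\vec{Y} \leftarrow \vec{y}\rangle(\varphi \wedge X(\vec{u}) \ne x)$ is equivalent to
$\langle\vec{Y} \leftarrow \vec{y}\rangle(\varphi \wedge (\bigvee_{x' \in \mathcal{R}(X) - \{x\}} X(\vec{u}) = x'))$. Thus, we can assume without loss of generality that
$\varphi$ has no negations. By applying D11, we can assume without loss of generality that the
same setting $\vec{u}$ of the exogenous variables is used in all the conjuncts. Thus, it suffices to
show that $\langle\vec{Y} \leftarrow \vec{y}\rangle(\vec{X}(\vec{u}) = \vec{x}) \in C$ iff $T \models \langle\vec{Y} \leftarrow \vec{y}\rangle(\vec{X}(\vec{u}) = \vec{x})$ for $\vec{X} = \mathcal{V} - \vec{Y}$.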

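To do this, we proceed by induction on $|\mathcal{V}| - |\vec{Y}|$ again. The base case is dealt with using
D4, as before. So assume that $k \ge 1$ and $|\mathcal{V}| - |\vec{Y}| = k + 1$. Suppose that $\langle\vec{Y} \leftarrow \vec{y}\rangle(\vec{X}(\vec{u}) =
\vec{x}) \in C$. Let $X_1, X_2 \in \vec{X}$. Suppose that $X_1 \leftarrow x_1$ and $X_2 \leftarrow x_2$ are the assignments to
$X_1$ and $X_2$ in $\vec{X} \leftarrow \vec{x}$. Let $\vec{X}'' \leftarrow \vec{x}''$ and $\vec{X}' \leftarrow \vec{x}'$ be the result of removing $X_1 \leftarrow x_1$
and $X_2 \leftarrow x_2$, respectively, from $\vec{X} \leftarrow \vec{x}$. By D3, both $\langle\vec{Y} \leftarrow \vec{y}; X_1 \leftarrow x_1\rangle(\vec{X}''(\vec{u}) = \vec{x}'')$
and $\langle\vec{Y} \leftarrow \vec{y}; X_2 \leftarrow x_2\rangle(\vec{X}'(\vec{u}) = \vec{x}')$ are in $C$. By the induction hypothesis, both of these
formulas are true in $T$. By the soundness of D5, it follows that $T \models \langle\vec{Y} \leftarrow \vec{y}\rangle(\vec{X}(\vec{u}) = \vec{x})$,
as desired.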

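Conversely, suppose that $T \models \langle\vec{Y} \leftarrow \vec{y}\rangle(\vec{X}(\vec{u}) = \vec{x})$. Then, since D3 is sound, we have
that $T \models \langle\vec{Y} \leftarrow \vec{y}; X_1 \leftarrow x_1\rangle(\vec{X}''(\vec{u}) = \vec{x}'')$ and $T \models \langle\vec{Y} \leftarrow \vec{y}; X_2 \leftarrow x_2\rangle(\vec{X}'(\vec{u}) = \vec{x}')$.
By the induction hypothesis, we have that both $\langle\vec{Y} \leftarrow \vec{y}; X_1 \leftarrow x_1\rangle(\vec{X}''(\vec{u}) = \vec{x}'')$ and
$\langle\vec{Y} \leftarrow \vec{y}; X_2 \leftarrow x_2\rangle(\vec{X}'(\vec{u}) = \vec{x}')$ are in $C$. We now apply D5 to complete the proof. ∎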

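Theorem 4.2: A formula $\varphi \in \mathcal{L}^+(\mathcal{S})$ is satisfiable in $\mathcal{T}_{rec}(\mathcal{S})$ (resp., $\mathcal{T}_{uniq}(\mathcal{S})$) iff it is
satisfiable in $\mathcal{T}_{rec}(\mathcal{S}_\varphi)$ (resp., $\mathcal{T}_{uniq}(\mathcal{S}_\varphi)$).

Proof: Clearly, if a formula is satisfiable in $\mathcal{T}_{rec}(\mathcal{S}_\varphi)$ (resp., $\mathcal{T}_{uniq}(\mathcal{S}_\varphi)$) then it is satisfiable
in $\mathcal{T}_{rec}(\mathcal{S})$ (resp., $\mathcal{T}_{uniq}(\mathcal{S})$). We can easily convert a causal model $T = (\mathcal{S}_\varphi, \mathcal{F}) \in \mathcal{T}_{rec}(\mathcal{S}_\varphi)$
satisfying $\varphi$ to a causal model $T' = (\mathcal{S}, \mathcal{F}') \in \mathcal{T}_{rec}(\mathcal{S})$ satisfying $\varphi$ by simply defining
$F'_X$ to be a constant, independent of its arguments, for $X \in \mathcal{V} - \mathcal{V}_\varphi$; if $X \in \mathcal{V}_\varphi$, define
$F'_X(\vec{u}, \vec{x}, \vec{y}) = F_X(\vec{u}, \vec{x})$, where $\vec{u} \in \mathcal{R}(U^*)$, $\vec{x} \in \times_{Y \in \mathcal{V}_\varphi - \{X\}} \mathcal{R}(Y)$ and $\vec{y} \in \times_{Y \in \mathcal{V} - \mathcal{V}_\varphi} \mathcal{R}(Y)$;
if $\vec{u} \notin \mathcal{R}(U^*)$, define $F'_X(\vec{u}, \vec{x}, \vec{y})$ to be an arbitrary constant. An identical transformation
works for $T \in \mathcal{T}_{uniq}(\mathcal{S}_\varphi)$.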

For the converse, suppose that $\varphi$ is satisfiable in a causal model $T = (\mathcal{S}, \mathcal{F}) \in \mathcal{T}_{rec}(\mathcal{S})$.
Thus, there is an ordering $\prec$ on the variables in $\mathcal{V}$ such that if $X \prec Y$, then $F_X$ is
independent of the value of $Y$. This means we can view $F_X$ as a function of the exogenous variables
in $\mathcal{U}$ and the variables $Y \in \mathcal{V}$ such that $Y \prec X$. Let $Pre(X) = \{Y \in \mathcal{V} : Y \prec X\}$. For
convenience, I allow $F_X$ to take as arguments the values of only the variables in $\mathcal{U} \cup Pre(X)$,
rather than requiring its arguments to include the values of all the variables in $\mathcal{U} \cup \mathcal{V} - \{X\}$.
Now define functions $F'_X : (\times_{U \in \mathcal{U}} \mathcal{R}(U)) \times (\times_{Y \in \mathcal{V}_\varphi - \{X\}} \mathcal{R}(Y)) \rightarrow \mathcal{R}(X)$ for all $X \in \mathcal{V}$ by
induction on $\prec$ (that is, start with the $\prec$-minimal element, whose value is independent of
that of all the other variables, and work up the $\prec$ chains). Suppose $X \in \mathcal{V}_\varphi$ and $\vec{x}$ is a vector
of values for the variables in $\mathcal{V}_\varphi - \{X\}$. If $X$ is $\prec$-minimal, then define $F'_X(\vec{u}, \vec{x}) = F_X(\vec{u})$.
In general, define $F'_X(\vec{u}, \vec{x}) = F_X(\vec{u}, \vec{z})$, where $\vec{z}$ is a vector of values for the variables in
$Pre(X)$ defined as follows. If $Y \in Pre(X) \cap \mathcal{V}_\varphi$, then the value of the $Y$ component in $\vec{z}$ is
the value of the $Y$ component in $\vec{x}$; if $Y \in Pre(X) - \mathcal{V}_\varphi$, then the value of the $Y$ component
in $\vec{z}$ is $F'_Y(\vec{u}, \vec{x})$. (By the induction hypothesis, $F'_Y(\vec{u}, \vec{x})$ has already been defined.) Now
define a causal model $T' = (\mathcal{S}_\varphi, \mathcal{F}')$. It is easy to check that $T' \in \mathcal{T}_{rec}(\mathcal{S}_\varphi)$ (the ordering
of the variables is just $\prec$ restricted to $\mathcal{V}_\varphi$). Moreover, the construction guarantees that if
$\vec{X} \subseteq \mathcal{V}_\varphi$, then the solutions to the equations $T'_{\vec{X} \leftarrow \vec{x}}(\vec{u})$ and $T_{\vec{X} \leftarrow \vec{x}}(\vec{u})$ are the same, when
restricted to the variables in $\mathcal{V}_\varphi$. It follows that $T'$ satisfies $\varphi$.

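The argument in the case that $T \in \mathcal{T}_{uniq}(\mathcal{S})$ is similar in spirit. For $X \in \mathcal{V}_\varphi$, $\vec{u} \in
(\times_{U \in \mathcal{U}} \mathcal{R}(U))$, and $\vec{x} \in (\times_{Y \in \mathcal{V}_\varphi - \{X\}} \mathcal{R}(Y))$, define $F'_X(\vec{u}, \vec{x})$ to be the value of $X$ in the
unique solution to the equations in $T_{\mathcal{V}_\varphi - \{X\} \leftarrow \vec{x}}(\vec{u})$. [7] It is again straightforward to check
that now $T' = (\mathcal{S}_\varphi, \mathcal{F}') \in \mathcal{T}_{uniq}(\mathcal{S}_\varphi)$ and satisfies $\varphi$. ∎

7. This definition is easily seen to agree with the earlier definition of $F'_X$ if $T \in \mathcal{T}_{rec}$.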
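Theorem 4.3: A formula $\varphi \in \mathcal{L}^+(\mathcal{S})$ is satisfiable in $\mathcal{T}(\mathcal{S})$ iff $\varphi$ is satisfiable in $\mathcal{T}(\mathcal{S}_\varphi^+)$.

Proof: If $\|\mathcal{S}\| \le \|\mathcal{S}_\varphi\|^2 + \|\mathcal{S}_\varphi\|$ then the proof is immediate, so suppose that $\|\mathcal{S}\| > \|\mathcal{S}_\varphi\|^2 +
\|\mathcal{S}_\varphi\|$ and $\varphi$ is satisfied in a causal model $T = (\mathcal{S}, \mathcal{F}) \in \mathcal{T}(\mathcal{S})$. Before going on with the
proof, it is useful to define some notation. Let $\mathcal{V} = \{X_1, \ldots, X_m\}$, where $\mathcal{V}_\varphi = \{X_1, \ldots, X_k\}$
and $\mathcal{V} - \mathcal{V}_\varphi = \{X_{k+1}, \ldots, X_m\}$. Given a vector $\vec{x} \in \mathcal{R}(X^*) = \times_{X \in \mathcal{V}_\varphi} \mathcal{R}(X)$ and $X_i \in \mathcal{V}_\varphi$,
let $\vec{x}_{-i}$ denote the vector excluding the value for $X_i$. For each $X_i \in \mathcal{V}_\varphi$, choose two values
$x_{i0}$ and $x_{i1}$ in $\mathcal{R}(X_i)$. Define $T' = (\mathcal{S}_\varphi^+, \mathcal{F}')$ by defining $F'_{X_i}(\vec{u}, \vec{x}_{-i}, \vec{y}_{-i}, y_i) = x$, where

• $x = y_i$ if $\vec{x}_{-i} = \vec{y}_{-i}$ and $X_i = y_i$ in some solution to the equations in $T_{\mathcal{V}_\varphi - \{X_i\} \leftarrow \vec{x}_{-i}}(\vec{u})$;

• $x = x_{i0}$ if $y_i \ne x_{i0}$ and either $\vec{x}_{-i} \ne \vec{y}_{-i}$ or there is no solution to the equations in
$T_{\mathcal{V}_\varphi - \{X_i\} \leftarrow \vec{x}_{-i}}(\vec{u})$ in which $X_i = y_i$;

• $x = x_{i1}$ otherwise.

Finally, define $F'_{X^*}(\vec{u}, \vec{x}) = \vec{x}$.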

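I now show that the construction again guarantees that if $\vec{X} \subseteq \mathcal{V}_\varphi$, then the solutions
to the equations $T'_{\vec{X} \leftarrow \vec{x}}(\vec{u})$ and $T_{\vec{X} \leftarrow \vec{x}}(\vec{u})$ are the same, when restricted to the variables in
$\mathcal{V}_\varphi$. First suppose that $(\vec{y}, \vec{z})$ is a solution to the equations in $T_{\vec{X} \leftarrow \vec{x}}(\vec{u})$, where $\vec{y} \in \mathcal{R}(X^*)$
and $\vec{z} \in \times_{Y \in \mathcal{V} - \mathcal{V}_\varphi} \mathcal{R}(Y)$. It must be the case that $\vec{x}$ and $\vec{y}$ agree on the variables in $\vec{X}$,
so $(\vec{y}, \vec{z})$ is also a solution of the equations in $T_{\mathcal{V}_\varphi - \{X_i\} \leftarrow \vec{y}_{-i}}(\vec{u})$ if $X_i \in \mathcal{V}_\varphi - \vec{X}$. Thus,
$F'_{X_i}(\vec{u}, \vec{y}_{-i}, \vec{y}) = y_i$. It follows that $(\vec{y}, \vec{y})$ is a solution to the equations in $T'_{\vec{X} \leftarrow \vec{x}}(\vec{u})$.

Conversely, suppose that $(\vec{y}, \vec{y}')$ is a solution to the equations in $T'_{\vec{X} \leftarrow \vec{x}}(\vec{u})$. Then the
definition of $F'_{X^*}$ guarantees that $\vec{y} = \vec{y}'$. Moreover, since $\vec{x}$ and $\vec{y}$ agree on the variables in $\vec{X}$,
$(\vec{y}, \vec{y})$ must also be a solution to the equations in $T'_{\mathcal{V}_\varphi - \{X_i\} \leftarrow \vec{y}_{-i}}(\vec{u})$. Thus, $F'_{X_1}(\vec{u}, \vec{y}_{-1}, \vec{y}) =
y_1$, which means that there must be some vector $\vec{z}$ of values for the variables in $\mathcal{V} - \mathcal{V}_\varphi$ such
that $(\vec{y}, \vec{z})$ is a solution to the equations in $T_{\mathcal{V}_\varphi - \{X_1\} \leftarrow \vec{y}_{-1}}(\vec{u})$. But then it is easy to check
that $(\vec{y}, \vec{z})$ must in fact be a solution to the equations in $T_{\mathcal{V}_\varphi - \{X_i\} \leftarrow \vec{y}_{-i}}(\vec{u})$ for all $i = 1, \ldots, k$.
It follows that $(\vec{y}, \vec{z})$ is a solution to the equations in $T_{\vec{X} \leftarrow \vec{x}}(\vec{u})$, as desired. This suffices to
prove this direction of the theorem.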

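Now suppose that $\varphi$ is satisfied in a causal model $T = (\mathcal{S}_\varphi^+, \mathcal{F}) \in \mathcal{T}(\mathcal{S}_\varphi^+)$. Since $\|\mathcal{S}\| >
\|\mathcal{S}_\varphi\|^2 + \|\mathcal{S}_\varphi\|$, there must be an injective function $f : \mathcal{R}(X^*) \rightarrow \times_{Y \in \mathcal{V} - \mathcal{V}_\varphi} \mathcal{R}(Y)$ and two
distinct vectors $\vec{y}_0 = (y_{01}, \ldots, y_{0k})$, $\vec{y}_1 = (y_{11}, \ldots, y_{1k})$ that are not in the range of $f$.
Choose two distinct vectors $\vec{x}_0 = (x_{10}, \ldots, x_{k0})$, $\vec{x}_1 = (x_{11}, \ldots, x_{k1}) \in \mathcal{R}(X^*)$. Define
$T' = (\mathcal{S}, \mathcal{F}') \in \mathcal{T}(\mathcal{S})$ as follows. If $X_i \in \mathcal{V}_\varphi$, $\vec{x}_{-i} \in \times_{Y \in \mathcal{V}_\varphi - \{X_i\}} \mathcal{R}(Y)$, $\vec{z} \in \mathcal{R}(X^*)$, and
$\vec{y} \in \times_{Y \in \mathcal{V} - \mathcal{V}_\varphi} \mathcal{R}(Y)$, let

$$F'_{X_i}(\vec{x}_{-i}, \vec{y}) = \begin{cases} F_{X_i}(\vec{x}_{-i}, \vec{z}) & \text{if } f(\vec{z}) = \vec{y}, \\ x_{i0} & \text{if } \vec{y} \text{ is not in the range of } f,\ \vec{y} \ne \vec{y}_1, \\ x_{i1} & \text{otherwise.} \end{cases}$$

If $X_j \in \mathcal{V} - \mathcal{V}_\varphi$, $\vec{x} \in \mathcal{R}(X^*)$ and $\vec{y}_{-j} \in \times_{Y \in \mathcal{V} - \mathcal{V}_\varphi - \{X_j\}} \mathcal{R}(Y)$, then let

$$F'_{X_j}(\vec{x}, \vec{y}_{-j}) = \begin{cases} y & \text{if } f(F_{X^*}(\vec{x})) = (\vec{y}_{-j}, y), \\ y_{0j} & \text{if } f(F_{X^*}(\vec{x})) \ne (\vec{y}_{-j}, y') \text{ for all } y' \in \mathcal{R}(X_j) \text{ and } \vec{x} \ne \vec{x}_0, \\ y_{1j} & \text{otherwise.} \end{cases}$$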
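Again, I show that the construction guarantees that if $\vec{X} \subseteq \mathcal{V}_\varphi$, then the solutions to the
equations $T'_{\vec{X} \leftarrow \vec{x}}(\vec{u})$ and $T_{\vec{X} \leftarrow \vec{x}}(\vec{u})$ are the same, when restricted to the variables in $\mathcal{V}_\varphi$. First
suppose that $(\vec{y}, \vec{z})$ is a solution to the equations in $T_{\vec{X} \leftarrow \vec{x}}(\vec{u})$, where $\vec{y}, \vec{z} \in \mathcal{R}(X^*)$. It is easy
to check that $(\vec{y}, f(\vec{z}))$ is a solution to the equations in $T'_{\vec{X} \leftarrow \vec{x}}(\vec{u})$. Conversely, suppose that
$(\vec{y}, \vec{z})$ is a solution to the equations in $T'_{\vec{X} \leftarrow \vec{x}}(\vec{u})$, where $\vec{y} \in \mathcal{R}(X^*)$ and $\vec{z} \in \times_{Y \in \mathcal{V} - \mathcal{V}_\varphi} \mathcal{R}(Y)$.
I claim that we must have $\vec{z} = f(F_{X^*}(\vec{y}))$. If, in fact, this is the case, then it is easy to
check that $(\vec{y}, F_{X^*}(\vec{y}))$ is a solution to the equations in $T_{\vec{X} \leftarrow \vec{x}}(\vec{u})$. On the other hand, if
$\vec{z} \ne f(F_{X^*}(\vec{y}))$, then the definition of $F'_{X_j}$ for $X_j \in \mathcal{V} - \mathcal{V}_\varphi$ guarantees that $\vec{z} = \vec{y}_0$ unless
$\vec{y} = \vec{x}_0$; if $\vec{y} = \vec{x}_0$, then $\vec{z} = \vec{y}_1$. But the definition of $F'_{X_i}$ for $X_i \in \mathcal{V}_\varphi$ guarantees that if
$\vec{z} = \vec{y}_0$, then $\vec{y} = \vec{x}_0$; otherwise, $\vec{y} = \vec{x}_1$. Thus, $(\vec{y}, \vec{z})$ is a solution iff $\vec{z} = f(F_{X^*}(\vec{y}))$. This
suffices to prove the result. ∎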

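Theorem 4.4: Given as input a pair $(\varphi, \mathcal{S})$, where $\varphi \in \mathcal{L}^+(\mathcal{S})$ (resp., $\mathcal{L}_{uniq}(\mathcal{S})$) and $\mathcal{S}$ is
a finite signature, the problem of deciding if $\varphi$ is satisfiable with respect to $\mathcal{T}_{rec}(\mathcal{S})$ (resp.,
$\mathcal{T}_{uniq}(\mathcal{S})$, $\mathcal{T}(\mathcal{S})$) is NP-complete (resp., NP-hard) in $|\varphi|$; if $\varphi \in \mathcal{L}_{GP}(\mathcal{S})$, then the problem
of deciding if $\varphi$ is satisfiable in $\mathcal{T}_{rec}(\mathcal{S})$ (resp., $\mathcal{T}_{uniq}(\mathcal{S})$) is NP-complete (resp., NP-hard).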

Proof: The NP lower bound is easy for $\mathcal{L}^+(\mathcal{S})$ and $\mathcal{L}_{uniq}(\mathcal{S})$, since there is an obvious way
to encode the satisfiability problem for propositional logic into the satisfiability problem for
$\mathcal{L}^+$ and $\mathcal{L}_{uniq}$. Given a propositional formula $\varphi$ with primitive propositions $p_1, \ldots, p_k$, let
$\mathcal{S} = (\emptyset, \{X_1, \ldots, X_k\}, \mathcal{R})$, where $\mathcal{R}(X_i) = \{0, 1\}$ for $i = 1, \ldots, k$. Replace each occurrence
of the primitive proposition $p_i$ in $\varphi$ with the formula $X_i = 1$. This gives us a formula $\varphi'$ in
$\mathcal{L}_{uniq}(\mathcal{S})$. It is easy to see that if $\varphi'$ is satisfiable in a causal model $T \in \mathcal{T}(\mathcal{S})$ (and, a fortiori,
if $\varphi'$ is satisfiable in a causal model $T$ in either $\mathcal{T}_{rec}(\mathcal{S})$ or $\mathcal{T}_{uniq}(\mathcal{S})$) then the solution to the
equations in $T$ defines a satisfying assignment for $\varphi$. Conversely, if $\varphi$ is satisfiable, say by
some truth assignment $v$, then we can trivially construct a causal model $T \in \mathcal{T}_{rec}(\mathcal{S})$ such
that $F_{X_i} = v(p_i)$. (For simplicity, I assume that valuations assign values 0 and 1 rather
than false and true.)

This trivial construction of $\varphi'$ will not work for $\mathcal{L}_{GP}(\mathcal{S})$, since we do not have disjunctions
or negations available. The lack of negations does not cause a problem. We can assume
without loss of generality that the negations occur only in front of primitive propositions,
and we can capture $\neg p_i$ by the formula $X_i = 0$. The idea for dealing with disjunctions is that
a formula such as $p_1 \vee \neg p_2 \vee p_3$ is translated to $[X_1 \leftarrow 0; X_2 \leftarrow 1; X_3 \leftarrow 0](Y = 0)$, where $Y$
is a fresh variable. Essentially, we are viewing $p_1 \vee \neg p_2 \vee p_3$ as $(\neg p_1 \wedge p_2 \wedge \neg p_3) \Rightarrow \mathit{false}$, which
is why we write, for example, $X_1 \leftarrow 0$ even though $p_1$ appears positively in the disjunction.

To make matters simpler, assume that $\varphi$ is a formula in 3-CNF. This suffices for
NP-hardness, since the satisfiability problem for 3-CNF formulas is also NP-hard (Garey &
Johnson, 1979). Suppose $\varphi$ is of the form $c_1 \wedge \ldots \wedge c_m$, where each $c_j$ is a clause consisting of a
disjunction of three primitive propositions and their negations. Suppose that the primitive
propositions that appear in $\varphi$ are $p_1, \ldots, p_k$. Let $\mathcal{S} = (\emptyset, \{X_1, \ldots, X_k, Y_1, \ldots, Y_m\}, \mathcal{R})$,
where $\mathcal{R}(X_i) = \mathcal{R}(Y_j) = \{0, 1\}$ for all $i$, $j$. Suppose that $c_j$, the $j$th clause of $\varphi$, is of the
form $q_{j1} \vee q_{j2} \vee q_{j3}$, where $q_{jh}$ is either $p_{j_h}$ or $\neg p_{j_h}$ for some $j_h$. Let $c_j^*$ be the $\mathcal{L}_{GP}$ formula

$[X_{j_1} \leftarrow x_{j1}; X_{j_2} \leftarrow x_{j2}; X_{j_3} \leftarrow x_{j3}](Y_j = 0)$,

where $x_{jh}$ is 0 if $q_{jh}$ is $p_{j_h}$ and $x_{jh}$ is 1 if $q_{jh}$ is $\neg p_{j_h}$ for $h = 1, 2, 3$. Let $\varphi'$ be

$[\mathit{true}](Y_1 = 1 \wedge \ldots \wedge Y_m = 1) \wedge c_1^* \wedge \ldots \wedge c_m^*$.

I claim that $\varphi$ is a satisfiable propositional formula iff the $\mathcal{L}_{GP}$ formula $\varphi'$ is satisfiable in
$\mathcal{T}_{rec}(\mathcal{S})$ (resp., $\mathcal{T}_{uniq}(\mathcal{S})$). First suppose that $\varphi'$ is satisfiable, say in some model $T \in \mathcal{T}_{uniq}(\mathcal{S})$.
(If this direction holds for $T \in \mathcal{T}_{uniq}(\mathcal{S})$, it clearly holds a fortiori for $T \in \mathcal{T}_{rec}(\mathcal{S})$.) Let $\vec{z}$ be
the unique solution to the equations in $T$. By construction, the $Y_j$ component of $\vec{z}$ is 1 for
$j = 1, \ldots, m$. Let $x_i^*$ be the value of the $X_i$ component in $\vec{z}$. Consider the valuation $v$ such
that $v(p_i) = x_i^*$. I claim that $v(\varphi) = 1$. To see this, suppose that clause $c_j$ is $q_{j1} \vee q_{j2} \vee q_{j3}$.
If $v$ makes $q_{j1}$, $q_{j2}$, and $q_{j3}$ false, then we must have $x_{jh} = x_{j_h}^*$ for $h = 1, 2, 3$. Since
$T \models [X_{j_1} \leftarrow x_{j1}; X_{j_2} \leftarrow x_{j2}; X_{j_3} \leftarrow x_{j3}](Y_j = 0)$ and the value of the $X_{j_h}$ component of $\vec{z}$ is
$x_{jh}$ for $h = 1, 2, 3$, it follows that $\vec{z}$ is a solution to the equations in
$T_{X_{j_1} \leftarrow x_{j1}; X_{j_2} \leftarrow x_{j2}; X_{j_3} \leftarrow x_{j3}}$.
But this contradicts the fact that $T \models [X_{j_1} \leftarrow x_{j1}; X_{j_2} \leftarrow x_{j2}; X_{j_3} \leftarrow x_{j3}](Y_j = 0)$ (since
the $Y_j$ component of $\vec{z}$ is 1). It follows that $v(c_j) = v(q_{j1} \vee q_{j2} \vee q_{j3}) = 1$. Since this is true
for all clauses $c_j$, we must have that $v(\varphi) = 1$.

For the converse, suppose that $\varphi$ is satisfiable, say by valuation $v$. I show that $\varphi'$ is
satisfiable in some $T \in \mathcal{T}_{rec}(\mathcal{S})$. Order the variables so that $X_{j_1}, X_{j_2}, X_{j_3} \prec Y_j$. (There are many
orderings of the variables that satisfy these constraints; any one will do.) Define $F_{X_i} = v(p_i)$
(so that $F_{X_i}$ is a constant, independent of its arguments); define $F_{Y_j}(x_{j_1}, x_{j_2}, x_{j_3}) = 1$ if
$(x_{j_1}, x_{j_2}, x_{j_3}) = (v(p_{j_1}), v(p_{j_2}), v(p_{j_3}))$ and 0 otherwise. It is easy to check that $T \models \varphi'$, as
desired.
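
The construction in this direction is simple enough to run. The Python sketch below is illustrative only: the encoding of a clause as three (variable index, is-positive) pairs, the assumption that each clause mentions three distinct variables, and all function names are choices made for the example. It builds the recursive model determined by a valuation v and evaluates the translated formula in it.

```python
from itertools import product


def clause_setting(clause):
    """Intervention of c*_j: X <- 0 for a positive literal, X <- 1 for a negated one."""
    return {var: (0 if positive else 1) for var, positive in clause}


def solve(valuation, clauses, do):
    """Unique solution of the model built from `valuation` after intervention `do`."""
    # F_{X_i} is the constant v(p_i) unless X_i is set by the intervention.
    x = {i: do.get(i, val) for i, val in valuation.items()}
    # F_{Y_j} = 1 iff the clause's variables carry exactly the values v gives them.
    y = [1 if all(x[var] == valuation[var] for var, _ in clause) else 0
         for clause in clauses]
    return x, y


def satisfies_phi_prime(valuation, clauses):
    """Evaluate [true](Y_1 = 1 and ... and Y_m = 1) and c*_1 and ... and c*_m."""
    _, y = solve(valuation, clauses, {})
    if any(value != 1 for value in y):
        return False
    for j, clause in enumerate(clauses):
        _, y = solve(valuation, clauses, clause_setting(clause))
        if y[j] != 0:
            return False
    return True


def satisfies_phi(valuation, clauses):
    """Ordinary propositional evaluation of the 3-CNF formula."""
    return all(any(valuation[var] == (1 if positive else 0)
                   for var, positive in clause)
               for clause in clauses)


def all_valuations(n):
    """Every 0/1 valuation of p_1, ..., p_n."""
    for bits in product((0, 1), repeat=n):
        yield dict(zip(range(1, n + 1), bits))
```

Under these assumptions the model built from v satisfies the translated formula exactly when v satisfies the original formula: the intervention for clause j forces all three literals to be false, and Y_j then equals 1 only if v itself made them all false.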

For the NP upper bound in the case of $\mathcal{T}_{rec}(\mathcal{S})$, it clearly suffices to deal with $\varphi \in \mathcal{L}^+$.
Suppose we are given $(\varphi, \mathcal{S})$ with $\varphi \in \mathcal{L}^+$. We want to check if $\varphi$ is satisfiable in $\mathcal{T}_{rec}(\mathcal{S})$.
The basic idea is to guess a causal model $T$ and verify that it indeed satisfies $\varphi$. There
is a problem with this though. To completely describe a model $T$, we need to describe
the functions $F_X$. However, there may be many variables $X$ in $\mathcal{S}$ and they can have many
possible inputs. Just describing these functions may take time much longer than polynomial
in $|\varphi|$. Part of the solution to this problem is provided by Theorem 4.2, which tells us that
it suffices to check whether $\varphi$ is satisfiable in $\mathcal{T}_{rec}(\mathcal{S}_\varphi)$. In light of this, for the remainder
of this part of the proof, I assume without loss of generality that $\mathcal{S} = \mathcal{S}_\varphi$. This limits the
number of variables that we must consider to $O(|\varphi|)$. But even this does not solve our
problem completely. Since we are not given any bounds on $|\mathcal{R}(Y)|$ for variables $Y$ in $\mathcal{V}_\varphi$,
even describing the functions $F_Y$ for the variables $Y$ that appear in $\varphi$ on all their possible
input vectors could take time much more than polynomial in $|\varphi|$. The solution is to give only
a short partial description of a model $T$ and show that this suffices.

Consider all pairs $(\vec{Y} \leftarrow \vec{y}, \vec{u})$ such that there is a subformula of $\varphi$ of the form $[\vec{Y} \leftarrow \vec{y}]\psi$
and $\vec{u}$ appears in $\varphi$. Let $R$ be the set of all such pairs. Note that $|R| < |\varphi|^2$. We say that
two causal models $T$ and $T'$ in $\mathcal{T}_{rec}(\mathcal{S})$ agree on $R$ if, for all pairs $(\vec{Y} \leftarrow \vec{y}, \vec{u}) \in R$, the
(unique) solutions to the equations in $T_{\vec{Y} \leftarrow \vec{y}}(\vec{u})$ and $T'_{\vec{Y} \leftarrow \vec{y}}(\vec{u})$ are the same. It is easy to see
that if $T$ and $T'$ agree on $R$, then either both $T$ and $T'$ satisfy $\varphi$ or neither do. That is, all
we need to know about a causal model is how it deals with the relevant equations: those
corresponding to pairs in $R$.

For each pair $(\vec{Y} \leftarrow \vec{y}, \vec{u}) \in R$, guess a vector $\vec{v}(\vec{Y} \leftarrow \vec{y}, \vec{u})$ of values for the endogenous
variables; intuitively, these are the unique solutions to the relevant equations in a model
satisfying $\varphi$. Given these guesses, it is easy to check if $\varphi$ is satisfied in a model where these
guesses do indeed represent the solutions to the relevant equations. It remains to show that
there exists a causal model in $\mathcal{T}_{rec}(\mathcal{S})$ where the relevant equations have these solutions.

To do this, first guess an ordering $\prec$ on the variables. We can then verify, for each
fixed $\vec{u}$ that appears in $\varphi$, whether the solution vectors $\vec{v}(\vec{Y} \leftarrow \vec{y}, \vec{u})$ guessed for the relevant
equations are compatible with $\prec$, in the sense that it is not the case that there are two
solutions $(\vec{u}, \vec{x})$ and $(\vec{u}, \vec{x}')$ such that some variable $X$ takes on different values in $\vec{x}$ and $\vec{x}'$,
but all variables $Y$ such that $Y \prec X$ take on the same values in $\vec{x}$ and $\vec{x}'$. It is easy to
see that if the solutions are compatible with $\prec$, we can define the functions $F_X$ for $X \in \mathcal{V}$
such that all the equations hold and $F_X$ is independent of the values of $Y$ if $X \prec Y$ for all
$X, Y \in \mathcal{V}$. (Note we never actually have to write out the functions $F_X$, which may take
too long; we just have to know they exist.) To summarize, as long as we can guess some
solutions to the relevant equations such that a causal model that has these solutions satisfies
$\varphi$, and an ordering $\prec$ such that these solutions are compatible with $\prec$, then $\varphi$ is satisfiable
in $\mathcal{T}_{rec}(\mathcal{S})$. Conversely, if $\varphi$ is satisfiable in $T \in \mathcal{T}_{rec}(\mathcal{S})$, then there clearly are solutions
to the relevant equations that satisfy $\varphi$ and an ordering $\prec$ such that these solutions are
compatible with $\prec$. (We just take the solutions and the ordering $\prec$ from $T$.) This shows
that the satisfiability problem for $\mathcal{T}_{rec}$ is in NP, as desired. ∎
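
The compatibility test is a pairwise comparison of the guessed solution vectors and runs in polynomial time. The Python sketch below is illustrative only: the data layout (a list of intervention/solution pairs for one fixed context) and the function name are assumptions made for the example. It also skips any variable that is set by one of the two interventions being compared, on the assumption, left implicit in the text, that such a value comes from the intervention rather than from the variable's equation.

```python
def compatible(order, guesses):
    """Check guessed solutions against a guessed ordering.

    order:   list of endogenous variables, earliest first.
    guesses: list of (do, solution) pairs for one fixed context, where `do`
             maps the intervened variables to their values and `solution`
             maps every endogenous variable to its guessed value.
    """
    position = {v: i for i, v in enumerate(order)}
    for i, (do_a, sol_a) in enumerate(guesses):
        for do_b, sol_b in guesses[i + 1:]:
            for x in order:
                if x in do_a or x in do_b:
                    continue  # fixed by an intervention, not by F_X (assumption)
                earlier = order[:position[x]]
                same_inputs = all(sol_a[y] == sol_b[y] for y in earlier)
                if same_inputs and sol_a[x] != sol_b[x]:
                    return False  # F_X would need two outputs for one input
    return True
```

If the function returns True, each F_X can be defined on the inputs that actually occur in the guesses and filled in arbitrarily elsewhere, so the functions never have to be written out in full.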